Polycomb genes from maize - Mez1 and Mez2

ABSTRACT

The present invention relates to polycomb genes and polypeptides isolated from Zea mays.

RELATED APPLICATION INFORMATION

[0001] This application claims priority from U.S. Ser. No. 60/218,745 filed on Jul. 17, 2000.

TECHNICAL FIELD OF THE INVENTION

[0002] The present invention relates to plant genetic engineering. More specifically, the present invention relates to polycomb nucleic acids cloned from Zea mays L.

BACKGROUND OF THE INVENTION

[0003] In eukaryotes, gene expression patterns are regulated in response to developmental and environmental cues. These changes in gene expression patterns are often the result of specific transcriptional regulators. In many cases, this change in gene expression must be stably maintained through many mitotic cell divisions even though the transcriptional regulator that effected the change in expression is only present transiently. The stable maintenance of a transcription state is performed by a set of nonspecific factors. These factors are important in regulating chromatin states and establishing a chromatin “memory” to effectively maintain the proper gene expression patterns. In Drosophila, the Polycomb group (PcG) genes are involved in nonspecific, long-term stabilization of transcriptional repression. Recently, homologs of some of the polycomb group genes have been shown to affect developmental gene regulation in other species.

[0004] There are at least thirteen PcG proteins in Drosophila. Mutations in any of the thirteen identified PcG genes can lead to lethality during early development (See, Simon, J., Current Opinion in Cell Biology, 7(3):376-85 (1995); Pirrotta, V., Curr. Opin. Gen. Dev., 7(2):249-58 (1997); Pirrotta, V., Cell, 93(3):333-6 (1998)). The cause of this lethality is the failure to maintain transcriptional repression of homeotic genes of the Antennapedia/bithorax complex. The expression pattern of these homeotic genes is controlled in the embryo by activators and repressors that define body segments. During gastrulation, these specific factors are no longer present and PcG protein complexes stabilize a silenced state at genes repressed by the specific factors. Importantly, PcG complexes silence different targets in different cell lineages. This indicates that PcG complexes are able to silence based on factors such as transcription state and not just on sequence. An antagonistic set of factors which maintain the active transcriptional state, the trithorax group, also exists in Drosophila.

[0005] In addition to playing a role in developmentally regulated repression of gene expression, the PcG proteins are also involved in maintaining a silenced state at other loci. When high copy numbers (>3) of a white-Adh transgene are introduced into the Drosophila genome the level of white-Adh expression becomes reduced via cosuppression (Pal-Bhadra et al., Cell, 90:479-490 (1997)). In addition to reductions in the expression of the transgenes, the expression of the endogenous Adh gene is reduced as well. This cosuppression is relieved by mutations in polycomb (Pc) or polycomblike (pcl). The cosuppression is based on a homology sensing mechanism that leads to repression via PcG proteins (Pal-Bhadra et al., Cell, 99:35-46 (1999)). The PcG protein, enhancer of zeste, E(z), is required for trans-silencing of P-elements (Roche et al., Genetics, 149(4):1839-55 (1998)). Increased expression of E(z) or the human homolog (EZH2) results in enhancing position effect variegation (PEV) of a heterochromatin associated white locus (Laible et al., EMBO J., 16(11):3219-32 (1997)). The EZH2 gene was also able to restore telomere mediated gene repression in S. cerevisiae (Laible et al., EMBO J., 16(11):3219-32 (1997)). These studies suggest that the PcG proteins can play a role in epigenetic inactivation of gene expression distinct from the role of developmental regulation.

[0006] Many of the domains present in the PcG proteins that have been cloned are implicated in protein-protein interactions. The esc and E(z) proteins have been shown to interact with each other in a yeast two hybrid system and through in vitro binding assays (Jones et al., Mol. Cell Biol., 18(5):2825-34 (1998)). Homotypic and heterotypic interactions based on the SPM domain have been documented for Sex combs on midleg (Scm) and ph (Bornemann et al., Development, 122(5):1621-30 (1996); Peterson et al., Mol. Cell Biol., 17(11):6683-92 (1997)). The Xenopus Pc homolog, Xpc, forms complexes with itself and Bmi-1 (a psc homolog) (Reijnen et al., Mech. Dev., 53(1):35-46 (1995)). In other yeast two-hybrid screens, ph interacts with itself and with Psc, and Psc interacts with Pc (Pirrotta, V., Curr. Opin. Gen. Dev., 7(2):249-58 (1997)). These results indicate the presence of a large complex of PcG proteins that is formed through multiple protein-protein interactions among various PcG members.

[0007] Recent evidence suggests that PcG proteins actually form two distinct complexes. One complex contains E(z) and esc which have been found to directly interact (van Lohuizen et al., Mol. Cell Biol., 18(6):3572-9 (1998); Jones et al., Mol. Cell Biol., 18(5):2825-34 (1998); Sewalt et al., Mol. Cell Biol., 18(6):3586-95 (1998); Ng et al., Mol. Cell Biol., 20(9):3069-78 (2000)). The second complex is the PRC1 complex (which includes Pc/Ph/Scm/Psc).

[0008] Homologs of PcG proteins have been characterized in a number of species. Vertebrates appear to contain the most homologs of PcG proteins (Simon, Current Opinion in Cell Biology, 7(3):376-85 (1995)). Homologs of psc, Pc, ph, E(z) and esc have been cloned in mammals. The role of PcG proteins in mammals is believed to be very similar to the role in Drosophila.

[0009] While many of the domains present in PcG proteins are found in yeast proteins, no PcG homologs are present in the S. cerevisiae genome. In C. elegans and Arabidopsis, homologs of two PcG proteins, E(z) and esc, are found. A SET domain and a cys-rich region are found in E(z) (Carrington et al., Development, 122(12):4073-83 (1996); Jones et al., Genetics, 126(1):185-99 (1990); Jones, RS, et al., Mol. Cell. Biol., 13(10):6357-66 (1993)). The esc proteins contain a series of seven WD-40 repeats (Gutjahr et al., EMBO J., 14(17):4296-306 (1995); Simon et al., Mech. Devt., 53(2):197-208 (1995)).

[0010] The E(z) and esc homologs (maternal effect sterile-2 (mes-2) and maternal effect sterile-6 (mes-6)) from C. elegans were identified in a screen for maternal-effect mutations that result in sterile offspring (Holdeman et al., Development, 125(13):2457-67 (1998); Korf et al., Development, 125(13):2469-78 (1998)). The mes-2 and mes-6 genes are implicated as maternal genes required for germline immortality. Both mes-2 and mes-6 are localized to the nucleus of all embryonic cells and the nuclei of germline cells in larvae and adults. The localization of each protein is dependent upon the other and upon another protein, mes-3 (Holdeman et al., Development, 125(13):2457-67 (1998); Korf et al., Development, 125(13):2469-78 (1998)). Transgene arrays in the C. elegans genome are frequently silenced in germline cells (Kelly et al., Development, 125(13):2451-6 (1998)). Mutations in mes-2 and mes-6 completely alleviate silencing of transgenes in the germline cells (Kelly et al., Development, 125(13):2451-6 (1998)). These results suggest that the PcG proteins of C. elegans, mes-2 and mes-6, are involved in transcriptional repression specifically in the germline cells. It is likely that mes-2 and mes-6 repress transcription of genes that would lead to a differentiated state.

[0011] Arabidopsis also contains homologs of E(z) and esc (Goodrich et al., Nature, 386(6620):44-51 (1997); Grossniklaus et al., Science, 280(5362):446-50 (1998); Ohad et al., Plant Cell, 11(3):407-16 (1999)). Arabidopsis contains three E(z)-like genes, curly leaf (clf), Medea (Mea) and E(z)-likeA1 (EZA1), and one esc homolog, fertilization-independent endosperm (FIE1).

[0012] Clf mutants display curled leaves, altered maturation times and partial homeotic transformations of floral tissues (Goodrich et al., Nature, 386(6620):44-51 (1997)). Ectopic expression is also observed for the homeotic genes Agamous (AG) and Apetala3 (AP3). These genes are specifically expressed in floral tissues where clf mRNA is also present. This indicates that, similar to the Drosophila PcG proteins, the presence of CLF protein is not sufficient to repress AG and AP3 transcription but requires targeting factors (Goodrich et al., Nature, 386(6620):44-51 (1997)). The homeotic genes AG and AP3 are also ectopically expressed in Arabidopsis plants with reduced methylation levels (Finnegan et al., Proc. Natl. Acad. Sci. USA, 93(16):8449-8454 (1996)).

[0013] Medea was identified in a screen for Arabidopsis gametophyte lethal mutations (Grossniklaus et al., Science, 280(5362):446-50 (1998); Chaudhury et al., Proc. Natl. Acad. Sci., USA, 94(8):4223-8 (1997); Luo et al., Proc. Natl. Acad. Sci. USA, 96(1):296-301 (1999)). A plant heterozygous for mea mutations will produce 50% aborted seeds that collapse and do not germinate. Subsequently it has been found that MEA exhibits an imprinted pattern of gene expression (Kinoshita et al., Plant Cell, 11(10):1945-52 (1999); Vielle-Calzada et al., Genes Dev., 13(22):2971-82 (1999)). The maternal copy of Medea is expressed while the paternal copy is not. Medea mutants will allow endosperm development to occur in the absence of fertilization (Kiyosue et al., Proc. Natl. Acad. Sci. USA, 96(7):4186-91 (1999)). These results indicate that maternal expression of Medea is required to repress endosperm development. Due to the early lethality of Medea mutants, roles for Medea later in plant development have not been determined. A third E(z)-like gene, EZA1, is present in the Arabidopsis genome (Preuss, D., Plant Cell., 11(5):765-8 (1999)). Presently, the function of EZA1 is unknown.

[0014] Mutations in the Arabidopsis esc-like gene, FIE, have phenotypes similar to Medea. A female gametophyte with a FIE mutant allele will undergo replication of the central cell nucleus and endosperm development without a fertilization event (Ohad et al., Plant Cell, 11(3):407-16 (1999)). This indicates that FIE is critical in the repression of endosperm development. As with Medea, due to the early lethality of FIE mutants, the role of FIE in later developmental events has not been determined. The similar phenotypes of FIE and mea mutants suggest that these two genes may interact functionally like E(z) and esc homologs in other organisms.

SUMMARY OF THE INVENTION

[0015] In one embodiment, the present invention relates to an isolated and purified nucleic acid comprising a polynucleotide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 and conservatively modified and polymorphic variants thereof. In addition, the present invention relates to an isolated and purified nucleic acid comprising a polynucleotide having at least 60%, 70%, 80%, 90%, or 95% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.

[0016] In yet another embodiment, the present invention relates to an isolated and purified polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4 and conservatively modified variants thereof. In addition, the present invention relates to an isolated and purified polypeptide comprising an amino acid sequence having at least 60%, 70%, 80% or 95% identity to an amino acid sequence selected from the group consisting of: SEQ ID NO: 2 and SEQ ID NO: 4.

[0017] In yet a further embodiment, the present invention relates to an expression cassette containing a promoter sequence operably linked to an isolated and purified nucleic acid comprising a polynucleotide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 and conservatively modified and polymorphic variants thereof. Preferably, the expression cassette also contains a polyadenylation signal which is operably linked to the previously described nucleic acid. Examples of promoters which can be used in the expression cassette include constitutive and tissue specific promoters.

[0018] In yet another embodiment, the present invention relates to a bacterial cell containing the hereinbefore described expression cassette. The bacterial cell can be an Agrobacterium tumefaciens cell or an Agrobacterium rhizogenes cell.

[0019] In still yet another embodiment, the present invention relates to a plant cell transformed with the hereinbefore described expression cassette, a transformed plant containing such a plant cell, and to seed obtained from such a transformed plant. The plant cell, transformed plant and seed can be from Zea mays L.

BRIEF DESCRIPTION OF THE FIGURES

[0020] FIG. 1 shows the Mez1 polynucleotide and amino acid sequences. FIG. 1A shows that the polynucleotide sequence of the Mez1 cDNA is 3180 base pairs (bp). The putative start codon is indicated by a solid underline and the first in-frame stop codon is indicated by a wavy underline. FIG. 1B shows the 931 amino acid Mez1 protein.

[0021] FIG. 2 shows the Mez2 polynucleotide and amino acid sequences. FIG. 2A shows that the polynucleotide sequence of the Mez2 cDNA is 3030 bp. The putative start codon is indicated by a solid underline while the stop codon is indicated by a wavy underline. The locations of several introns are indicated by open arrowheads above the sequence. These introns were identified by sequencing of PCR products amplified from genomic DNA corresponding to bp 2032 to bp 2587 of the cDNA. The locations of the four Mu insertions are indicated by black arrowheads below the sequence. The Mez2-Mu1 allele contains a Mu element inserted into intron 1. The Mez2-Mu2, Mez2-Mu3 and Mez2-Mu4 Mu insertions are all located in exons. The nucleotides that flank the sequence that is removed by alternative splicing are indicated by a double underline. FIG. 2B shows the 893 amino acid Mez2 protein.

[0022] FIG. 3 shows the alignment of Mez1 and Mez2. The Mez1 and Mez2 protein sequences were aligned using ClustalW (http://dot.imgen.bcm.tmc.edu:9331/multi-align/Options/clustalw.html). These alignments were then processed using Boxshade to highlight identical residues in black and similar residues in gray. The two proteins are 42% identical and 56% similar over their entire length.

[0023] FIG. 4 shows the alignment of E(z) sequences. The sequences of Drosophila E(z) (AAC46462), human EZH1 (AAC50778), human EZH2 (AAC51520), C. elegans MES-2 (AAC27124), Arabidopsis CLF (AAC23781), Arabidopsis MEA (AAC39446), Arabidopsis EZA1 (AAD09108), Mez1 and Mez2 were aligned using ClustalW (http://dot.imgen.bcm.tmc.edu:9331/multi-align/Options/clustalw.html). The alignments were colored using Boxshade to highlight identical residues in black and conserved residues in gray. The location of a putative bipartite nuclear localization signal in the plant sequences is indicated by *'s above the alignments. # symbols are located above the cysteine-rich region. The N-terminal SET domain is indicated by + symbols above the alignment. A putative SANT DNA binding domain is shown with ^ symbols. $ symbols are placed above all acidic amino acid residues in an acidic region near the C-terminus. A region of high conservation in the plant sequences only, containing a CRRC sequence, is shown with x's above the alignment. The region between the CRRC domain and the nuclear localization signal is very divergent.

[0024] FIG. 5 shows schematic diagrams of E(z)-like proteins. E(z)-like proteins from plants and the Drosophila E(z) are represented by rectangles with the N-terminus located on the left for each protein. The locations of the EZD1, EZD2, SANT, Cys-rich, and SET domains are indicated by shading.

[0025] FIG. 6 shows the alignment of the SET domains from Drosophila E(z) (AAC46462), human EZH1 (AAC50778), human EZH2 (AAC51520), C. elegans mes-2 (AAC27124), Arabidopsis clf (AAC23781), Arabidopsis Mea (AAC39446), Arabidopsis EZA1 (AAD09108), Mez1 and Mez2 using ClustalW (region indicated by [] in FIG. 4). The Arabidopsis sequences are underlined. The maize sequences are in bold text. Bootstrap values are indicated by the numbers at nodes in the tree. Only nodes with bootstrap values greater than 50% are shown.

[0026] FIG. 7 shows that the Mez2 transcript is alternatively spliced in different tissues. Three predominant transcripts are found: the full length transcript and two smaller transcripts. The two smaller transcripts were isolated and sequenced to reveal the difference between the transcripts. The MEZ2^(a.s.1) transcript is lacking base pairs 1016 to 1676 and translation of this sequence results in a truncated protein of 341 amino acids lacking the conserved C-terminal domains. The MEZ2^(a.s.2) transcript is lacking base pairs 1016 to 1827 and translation of this sequence results in a 624 amino acid protein that lacks the large variable region from the middle of the MEZ2 protein. The MEZ2^(a.s.2) transcript has been found as the predominant transcript in embryo and endosperm tissues.

[0027] FIG. 8 shows the results of an RT-PCR analysis of the Mez1 and Mez2 expression patterns. In FIG. 8A, the primer pair Mez1F1-Mez1R1 was used to amplify 2 ng of cDNA from various maize tissues. The PCR products were then separated on a 1% agarose gel stained with ethidium bromide. The arrow indicates the expected size of the PCR product. In FIG. 8B, the primer pair Mez2F4-Mez2R8 was used to amplify 2 ng of cDNA from various maize tissues. The arrows indicate the expected sizes of the Mez2, Mez2^(as1) and Mez2^(as2) isoforms. In FIG. 8C, ubiquitin primers were used to amplify 0.2 ng of cDNA from the same maize tissues as a control. The pollen cDNA did not allow the amplification of significant amounts of product, indicating that the results using this cDNA are questionable.

DEFINITIONS

[0028] Units, prefixes, and symbols can be denoted in the SI accepted form. Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5′ to 3′ orientation, respectively. The headings provided herein are not limitations of the various aspects or embodiments of the invention which can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the specification as a whole.

[0029] As used herein, the terms “amplify” and “amplified”, used interchangeably, refer to the construction of multiple copies of a nucleic acid sequence or multiple copies complementary to the nucleic acid sequence using at least one of the nucleic acid sequences as a template. Amplification methods include the polymerase chain reaction (hereinafter “PCR”; described in U.S. Pat. Nos. 4,683,195 and 4,683,202), the ligase chain reaction (hereinafter “LCR”; described in EP-A-320,308 and EP-A-439,182), the transcription-based amplification system (hereinafter “TAS”), nucleic acid sequence based amplification (hereinafter “NASBA”, Cangene, Mississauga, Ontario; described in Proc. Natl. Acad. Sci., USA, 87:1874-1878 (1990); Nature, 350 (No. 6313):91-92 (1991)), Q-Beta Replicase systems, and strand displacement amplification (hereinafter “SDA”). The product of amplification is referred to as an amplicon.

[0030] As used herein, the term “antibody” includes reference to an immunoglobulin molecule obtained by in vitro or in vivo generation of a humoral response, and includes both polyclonal and monoclonal antibodies. The term also includes genetically engineered forms such as chimeric antibodies (e.g., humanized murine antibodies), heteroconjugate antibodies (e.g., bispecific antibodies), and recombinant single chain Fc fragments (hereinafter “scFc”). The term “antibody” also includes antigen binding forms of antibodies (e.g., Fab′, F(ab′)₂, Fab, Fc, and inverted IgG (See, Pierce Catalog and Handbook, (1994-1995) Pierce Chemical Co., Rockford, Ill.)). An antibody immunologically reactive with a particular antigen can be generated in vivo or by recombinant methods such as by the selection of libraries of recombinant antibodies in phage or similar vectors (See, e.g., Huse et al., Science, 246:1275-1281 (1989); Ward et al., Nature, 341:544-546 (1989); and Vaughan et al., Nature Biotechnology, 14:309-314 (1996)).

[0031] As used herein, the term “antisense RNA” means an RNA sequence which is complementary to a sequence of bases in the mRNA in question in the sense that each base (or the majority of bases) in the antisense sequence (read in the 3′ to 5′ sense) is capable of pairing with the corresponding base (G with C, A with U) in the mRNA sequence read in the 5′ to 3′ sense.
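The base-pairing rule in this definition can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the function names `antisense_rna` and `paired_fraction` are hypothetical and do not appear in the specification, only the four standard RNA bases are handled, and only Watson-Crick pairs (G with C, A with U) are counted.

```python
# Illustrative sketch only; names are hypothetical, not from the specification.
# Watson-Crick pairing for RNA: G pairs with C, A pairs with U.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the fully complementary antisense RNA, written 5' to 3'."""
    # Reading the mRNA backwards and complementing each base gives the
    # antisense strand in the conventional 5' to 3' orientation.
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))


def paired_fraction(antisense: str, mrna: str) -> float:
    """Fraction of mRNA bases (read 5'->3') that pair with the antisense
    strand when the antisense strand is read 3'->5'."""
    antisense_3_to_5 = antisense.upper()[::-1]
    mrna = mrna.upper()
    if not mrna or len(antisense_3_to_5) != len(mrna):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = sum(COMPLEMENT[m] == a for m, a in zip(mrna, antisense_3_to_5))
    return pairs / len(mrna)
```

A value of `paired_fraction` below 1.0 but above 0.5 corresponds to the "majority of bases" case mentioned in the definition.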

[0032] As used herein, the term “conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or conservatively modified variants of the amino acid sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For example, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations” and represent one species of conservatively modified variation. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible “silent variation” of the nucleic acid. It is known by persons skilled in the art that each codon in a nucleic acid (except AUG, which is the only codon for the amino acid methionine; and UGG, which is the only codon for the amino acid tryptophan) can be modified to yield a functionally identical molecule. Therefore, each silent variation of a nucleic acid which encodes a polypeptide of the present invention is implicit in each described polypeptide sequence.
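The codon degeneracy described above can be made concrete with a minimal Python sketch. It assumes the standard genetic code; the names `CODON_TABLE`, `translate`, `synonymous_codons` and `count_silent_variants` are hypothetical and introduced here purely for illustration.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated with bases in the order U, C, A, G
# (third position varying fastest); '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str) -> str:
    """Translate an RNA coding sequence whose length is a multiple of three."""
    if len(rna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def synonymous_codons(amino_acid: str) -> list[str]:
    """All codons that encode the given one-letter amino acid, sorted."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(peptide: str) -> int:
    """Number of distinct coding sequences that encode the same peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide)
```

For a three-residue peptide Met-Ala-Trp, only the alanine codon can vary, so the count is four; for a protein of several hundred residues the number of silent variants becomes very large, which is the point the definition makes.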

[0033] With respect to amino acid sequences, persons skilled in the art will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a “conservatively modified variant” where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art.

[0034] The following six groups each contain amino acids that are conservative substitutions for one another:

[0035] 1) Alanine (A), Serine (S), Threonine (T);

[0036] 2) Aspartic acid (D), Glutamic acid (E);

[0037] 3) Asparagine (N), Glutamine (Q);

[0038] 4) Arginine (R), Lysine (K);

[0039] 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and

[0040] 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W). See also, Creighton (1984) Proteins W. H. Freeman and Company.
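The six groups listed above can be encoded as a simple lookup, sketched below in Python. The names `CONSERVATIVE_GROUPS` and `is_conservative_substitution` are hypothetical. Residues that the list does not assign to any group (for example glycine, proline, cysteine and histidine) are treated here as having no conservative substitutes, which is an assumption of this sketch rather than a statement from the specification.

```python
# The six groups from paragraphs [0035]-[0040], in one-letter code.
CONSERVATIVE_GROUPS = [
    {"A", "S", "T"},
    {"D", "E"},
    {"N", "Q"},
    {"R", "K"},
    {"I", "L", "M", "V"},
    {"F", "Y", "W"},
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if two different residues belong to the same group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )
```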

[0041] As used herein, the term “constitutive promoter” refers to a promoter which is active under most environmental conditions.

[0042] As used herein, the term “full length” when used in connection with a specified polynucleotide or encoded protein refers to having the entire amino acid sequence of a native (i.e., non-synthetic), endogenous, catalytically active form of the specified protein. Methods for determining whether a sequence is full length are well known in the art. Examples of such methods which can be used include Northern or Western blots, primer extension, etc. Additionally, comparison to known full-length homologous sequences can also be used to identify full length sequences of the present invention.

[0043] As used herein, the term “heterologous” when used to describe nucleic acids or polypeptides refers to nucleic acids or polypeptides that originate from a foreign species, or, if from the same species, are substantially modified from their original form. For example, a promoter operably linked to a heterologous structural gene is from a species different from that from which the structural gene was derived, or, if from the same species, is different from any naturally occurring allelic variants.

[0044] The term “immunologically reactive conditions” as used herein, includes reference to conditions which allow an antibody, generated to a particular epitope of an antigen, to bind to that epitope to a detectably greater degree than the antibody binds to substantially all other epitopes, generally at least two times above background binding, preferably at least five times above background. Immunologically reactive conditions are dependent upon the format of the antibody binding reaction and typically are those utilized in immunoassay protocols.

[0045] As used herein, the term “inducible promoter” refers to a promoter which is under environmental control. Examples of environmental conditions that may affect transcription by inducible promoters include anaerobic conditions or the presence of light.

[0046] As used herein, the term “isolated” includes reference to material which is substantially or essentially free from components which normally accompany or interact with it as found in its naturally occurring environment. The isolated material optionally comprises material not found with the material in its natural environment. However, if the material is in its natural environment, the material has been synthetically (e.g., non-naturally) altered by deliberate human intervention to a composition and/or placed in a locus in a cell (e.g., genome or subcellular organelle) not native to a material found in that environment.

[0047] Two polynucleotides or polypeptides are said to be “identical” if the sequence of nucleotides or amino acid residues, respectively, in the two sequences is the same when aligned (either manually for visual inspection or via the use of a computer algorithm or program) for maximum correspondence as described below. The terms “identical” or “percent identity” when used in the context of two or more polynucleotide or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. With respect to polypeptides or proteins having a “percent identity” or “percentage of sequence identity” one skilled in the art would recognize that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues possessing similar chemical and/or physical properties such as charge or hydrophobicity and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well-known to persons skilled in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity.
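The upward adjustment for conservative substitutions can be sketched as follows. This is one possible scoring scheme, not the method of any particular alignment program: the function name `percent_identity`, the parameter `conservative_credit`, and the example credit of 0.5 are assumptions made for illustration. The two inputs are assumed to be already aligned, of equal length, with `-` marking gaps; every alignment column counts toward the denominator, and a column containing a gap scores zero.

```python
# Residue-to-group lookup built from the six conservative substitution
# groups listed in the specification.
_GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]
GROUP_OF = {res: idx for idx, group in enumerate(_GROUPS) for res in group}


def percent_identity(aligned_a: str, aligned_b: str,
                     conservative_credit: float = 0.0) -> float:
    """Percent identity over all columns of a pre-computed alignment.

    conservative_credit is the partial score (between 0 and 1) given to a
    conservative substitution; 0.0 yields plain percent identity.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and equal length")
    score = 0.0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # a gap column contributes nothing
        if x == y:
            score += 1.0
        elif x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y):
            score += conservative_credit  # partial rather than full mismatch
    return 100.0 * score / len(aligned_a)
```

With the aligned pair `ADKL` and `AEKG`, the plain identity is 50%, and crediting the D-to-E substitution at 0.5 raises the figure to 62.5%.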

[0048] As used herein, the term “comparison window” includes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence may be compared to a reference sequence and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (e.g., gaps) compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Generally, the comparison window is at least 20 contiguous nucleotides in length, and can be 30, 40, 50, 100, or even longer. Persons skilled in the art will recognize that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence a gap penalty is typically introduced and is subtracted from the number of matches.

[0049] The alignment of polynucleotide and/or polypeptide sequences for the purposes of determining sequence identity and similarity can be by either manual alignment and visual inspection or via the use of some type of computer program or algorithm. In fact, a number of computer programs which can be used to align polynucleotide and/or polypeptide sequences are known in the art. Examples include the programs available in the Wisconsin Sequence Analysis Package, Version 9 (available from the Genetics Computer Group, Madison, Wis., 52711), such as GAP, BESTFIT, FASTA and TFASTA. For example, the GAP program is capable of calculating both the identity and similarity between two polynucleotide or two polypeptide sequences. Specifically, the GAP program uses the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol., 48:443-453 (1970)). Another example of a useful computer program is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol. Evol., 35:351-360 (1987). Yet another example of a useful computer program that can be used for determining percent sequence identity and sequence similarity is the BLAST algorithm (Altschul et al., J. Mol. Biol., 215:403-410 (1990)). The software for performing BLAST analysis is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
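For orientation, the global alignment idea of Needleman and Wunsch can be sketched in a few lines of Python. This is a simplified teaching version and does not reproduce the GAP program: the function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration, whereas production programs typically use substitution matrices and separate gap opening and extension penalties.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Aligning `ACGT` with `AGT` under these assumed scores places a single gap opposite the C, giving three matching columns out of four.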

[0050] With respect to polynucleotide sequences, the term “substantial identity” means that a polynucleotide comprises a sequence that has at least 60% sequence identity, preferably at least 70% sequence identity, more preferably at least 80% sequence identity, even more preferably at least 90% sequence identity and most preferably at least 95% sequence identity, compared to a reference sequence using one of the alignment programs described herein conducted according to standard parameters. One skilled in the art will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning and the like. Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 60%, more preferably at least 70%, 80%, 90% identity, and most preferably at least 95% identity.

[0051] Polynucleotide sequences can also be considered to be substantially identical if two molecules hybridize to each other under stringent conditions. However, polynucleotides which do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This can occur when a copy of a polynucleotide is created using the maximum codon degeneracy permitted by the genetic code. One indication that two polynucleotide sequences are substantially identical is that the polypeptide encoded by the first polynucleotide is immunologically cross-reactive with the polypeptide encoded by the second polynucleotide.

[0052] With peptides, the term “substantial identity” as used herein means that a peptide comprises a sequence having at least 60% sequence identity to a reference sequence, preferably 70% sequence identity, more preferably 80% sequence identity, even more preferably 90% sequence identity, and most preferably at least 95% sequence identity to the reference sequence over a specified comparison window. Preferably, optimal alignment is conducted using the homology alignment algorithm (GAP program discussed previously) of Needleman and Wunsch, J. Mol. Biol., 48:443-453 (1970). An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide where the two peptides differ only by a conservative substitution. Peptides which are “substantially similar” share sequences as described above except that any residue positions which are not identical differ only by conservative amino acid changes.

[0053] As used herein, the term “Mez1 gene” refers to a gene of the present invention, specifically, the heterologous genomic form of a full length Mez1 polynucleotide.

[0054] As used herein, the term “Mez1 nucleic acid” refers to a nucleic acid of the present invention, specifically, a nucleic acid comprising a polynucleotide of the present invention encoding a Mez1 polypeptide (hereinafter a “Mez1 polynucleotide”). An example of a Mez1 polynucleotide (cDNA) is shown in SEQ ID NO: 1.

[0055] As used herein, the terms “Mez1 polypeptide”, “Mez1 peptide” and “Mez1 protein” are used interchangeably and refer to a polypeptide shown in SEQ ID NO: 2. The terms also include fragments, variants, homologs, alleles or precursors (e.g., preproproteins or proproteins) thereof.

[0056] As used herein, the term “Mez2 gene” refers to a gene of the present invention, specifically, the heterologous genomic form of a full length Mez2 polynucleotide.

[0057] As used herein, the term “Mez2 nucleic acid” refers to a nucleic acid of the present invention, specifically, a nucleic acid comprising a polynucleotide of the present invention encoding a Mez2 polypeptide (hereinafter a “Mez2 polynucleotide”). An example of a Mez2 polynucleotide (cDNA) is shown in SEQ ID NO: 3.

[0058] As used herein, the terms “Mez2 polypeptide”, “Mez2 peptide” and “Mez2 protein” are used interchangeably and refer to a polypeptide shown in SEQ ID NO: 4. The terms also include fragments, variants, homologs, alleles or precursors (e.g., preproproteins or proproteins) thereof. A “Mez2 protein” is a protein of the present invention and comprises a Mez2 polypeptide.

[0059] As used herein, the term “nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues having the essential nature of natural nucleotides in that they hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides (e.g., peptide nucleic acids).

[0060] As used herein, the term “nucleotide(s)” refers to a macromolecule containing a sugar (either a ribose or deoxyribose), a phosphate group and a nitrogenous base.

[0061] As used herein, the term “operably linked” includes reference to a functional linkage between a promoter and a second sequence, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to the second sequence. Generally, operably linked means that the polynucleotide sequences being linked are contiguous and, where necessary to join two protein coding regions, contiguous and in the same reading frame.

[0062] As used herein, the term “plant” includes reference to whole plants, plant organs (e.g., leaves, stems, flowers, roots, etc.), seeds and plant cells and progeny of the same. Plant cell, as used herein, includes, but is not limited to, suspension cultures, embryos, meristematic regions, callus tissue, shoots, gametophytes, sporophytes, pollen and microspores. The class of plants which can be used in the methods of the present invention is generally as broad as the class of higher plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants) as well as gymnosperms (e.g., Coniferophyta (conifers), Cycadophyta (cycads), Ginkgophyta (maidenhair tree) and Gnetophyta (gnetophytes)). The term “plant” as used herein also includes plants of a variety of ploidy levels, such as polyploid, diploid, haploid and hemizygous.

[0063] As used herein, the term “plant promoter” refers to a promoter capable of initiating transcription in plant cells.

[0064] As used herein, the term “polymorphic variant” in connection with a polynucleotide sequence refers to a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants may also encompass “single nucleotide polymorphisms” (SNPs) in which the polynucleotide sequence varies by one base. The presence of SNPs may be indicative of a certain population for a disease state or propensity for a disease state.

[0065] As used herein, the term “polynucleotide” refers to a deoxyribopolynucleotide, ribopolynucleotide, or analogs thereof that have the essential nature of a natural ribonucleotide in that they hybridize, under stringent hybridization conditions, to substantially the same nucleotide sequence as naturally occurring nucleotides and/or allow translation into the same amino acid(s) as the naturally occurring nucleotide(s). A polynucleotide can be full length or a subsequence of a native or heterologous structural or regulatory gene. Unless otherwise indicated, the term includes reference to the specified sequence as well as the complementary sequence thereof. Thus, DNAs or RNAs with backbones modified for stability or for other reasons are “polynucleotides” as that term is intended herein. Moreover, DNAs or RNAs comprising unusual bases, such as inosine, or modified bases, such as tritylated bases, to name just two examples, are polynucleotides as the term is used herein. As used herein, the term polynucleotide includes such chemically, enzymatically or metabolically modified forms of polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including, but not limited to, simple and complex cells.

[0066] As used herein, the terms “polypeptide”, “peptide” and “protein” are used interchangeably to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogues of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. The essential nature of such analogues of naturally occurring amino acids is that, when incorporated into a protein, that protein is specifically reactive to antibodies elicited to the same protein but consisting entirely of naturally occurring amino acids. The terms “polypeptide”, “peptide” and “protein” are also inclusive of modifications including, but not limited to, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.

[0067] As used herein, the term “promoter” refers to a region of DNA upstream from the start of transcription and involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. A promoter can optionally include distal enhancers or repressor elements which can be located several thousand base pairs from the start site of transcription.

[0068] As used herein, the term “recombinant” includes reference to a cell, or nucleic acid, or vector, that has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid to a form not native to that cell, or that the cell is derived from a cell so modified. For example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.

[0069] As used herein, the term “recombinant expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements which permit transcription of a particular nucleic acid in a target cell. The expression vector can be part of a plasmid, virus, or nucleic acid fragment. Typically, the recombinant expression cassette portion of the expression vector includes a nucleic acid to be transcribed, and a promoter.

[0070] As used herein, the terms “residue”, “amino acid” and “amino acid residue” are used interchangeably to refer to an amino acid that is incorporated into a protein, polypeptide or peptide. The amino acid may be a naturally occurring amino acid, and unless otherwise limited, may encompass known analogs of natural amino acids that can function in a similar manner as naturally occurring amino acids.

[0071] As used herein, the terms “selective hybridization” and “selectively hybridizes” are used interchangeably and include reference to hybridization, under stringent hybridization conditions, of a nucleic acid sequence to a specified nucleic acid target sequence to a detectably greater degree (e.g., at least 2-fold over background) than its hybridization to non-target nucleic acid sequences and to the substantial exclusion of non-target nucleic acids. Selectively hybridizing sequences typically have at least about 80% sequence identity, preferably 90% sequence identity, and most preferably 100% sequence identity (i.e., complementary) with each other.

[0072] As used herein, the term “specifically binds” includes reference to the preferential association of a ligand, in whole or part, with a particular target molecule (i.e., “binding partner” or “binding moiety”) relative to compositions lacking that target molecule. It is, of course, recognized that a certain degree of non-specific interaction may occur between a ligand and a non-target molecule. Nevertheless, specific binding may be distinguished as mediated through specific recognition of the target molecule. Typically, specific binding results in a much stronger association between the ligand and the target molecule than between the ligand and non-target molecule. Specific binding by an antibody to a protein under such conditions requires an antibody that is selected for its specificity for a particular protein. The affinity constant of the antibody binding site for its cognate monovalent antigen is at least 10⁷, usually at least 10⁹, more preferably at least 10¹⁰, and most preferably at least 10¹¹ liters/mole.

[0073] As used herein, the terms “stringent hybridization conditions” or “stringent conditions” refer to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acid, but to no other sequences. Stringent conditions are sequence dependent and are different under different environmental parameters. An extensive guide to hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic Acid Probes Part 1, Chapter 2 “Overview of Principles of Hybridization and the Strategy of Nucleic Acid Probe Assays” Elsevier, N.Y. Generally, highly stringent conditions are selected to be about 5° C.-10° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. The T_m is the temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the target sequence hybridizes to a perfectly matched probe. Stringent conditions are those in which the salt concentration is less than about 1.0M sodium ion, typically about 0.01 to 1.0M sodium ion concentration (or other salts) at a pH of 7.0 to 8.3 and at a temperature of at least about 30° C. for short probes (such as those having a length between about 10 to 50 nucleotides) and at least about 60° C. for long probes (such as those having a length greater than 50 nucleotides). In contrast, low stringency conditions are at about 15-30° C. below the T_m. Stringent hybridization conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize at higher temperatures.
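The temperature windows described above can be illustrated with a short sketch. The specification does not give a formula for the melting point, so the sketch below assumes the Wallace rule (2° C. per A/T and 4° C. per G/C), a rough estimate commonly applied only to short oligonucleotide probes; the function names are hypothetical and chosen for illustration.

```python
def wallace_tm(probe: str) -> float:
    """Rough melting-point estimate for a short DNA probe.

    Assumption: Wallace rule, Tm = 2*(A+T) + 4*(G+C) in degrees C.
    The rule is not taken from the specification.
    """
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2.0 * at + 4.0 * gc


def high_stringency_window(tm: float) -> tuple[float, float]:
    """Temperatures about 5 to 10 degrees C below Tm (highly stringent)."""
    return (tm - 10.0, tm - 5.0)


def low_stringency_window(tm: float) -> tuple[float, float]:
    """Temperatures about 15 to 30 degrees C below Tm (low stringency)."""
    return (tm - 30.0, tm - 15.0)


if __name__ == "__main__":
    tm = wallace_tm("ATGCATGCATGCATGC")
    print(tm, high_stringency_window(tm), low_stringency_window(tm))
```

For longer probes, salt concentration and probe length influence the melting point, so an estimate of this kind would only be a starting point for empirical optimization.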

[0074] As used herein, the term “tissue-specific promoter” includes reference to a promoter in which expression of an operably linked gene is limited to a particular tissue or tissues.

[0075] As used herein, the term “transgenic plant” includes reference to a plant modified by introduction of a heterologous polynucleotide. Generally, the heterologous polynucleotide is a Mez1 or Mez2 structural or regulatory gene or subsequences thereof.

SEQUENCE LISTINGS

[0076] The present application also contains a sequence listing that contains twenty (20) sequences. The sequence listing contains nucleotide sequences and amino acid sequences. For the nucleotide sequences, the base pairs are represented by the following base codes:

Symbol    Meaning
A         A; adenine
C         C; cytosine
G         G; guanine
T         T; thymine
U         U; uracil
M         A or C
R         A or G
W         A or T/U
S         C or G
Y         C or T/U
K         G or T/U
V         A or C or G; not T/U
H         A or C or T/U; not G
D         A or G or T/U; not C
B         C or G or T/U; not A
N         (A or C or G or T/U)
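As an illustration of how the base codes above can be applied, the following sketch expands a sequence containing ambiguity symbols into every concrete DNA sequence it represents. It assumes DNA output, so each “T/U” entry is rendered as T; the names IUPAC_DNA and expand_ambiguous are hypothetical.

```python
from itertools import product

# Base codes from the table above, written for DNA (T/U rendered as T).
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def expand_ambiguous(seq: str) -> list[str]:
    """Return every unambiguous DNA sequence matching an ambiguity-coded one."""
    choices = [IUPAC_DNA[symbol] for symbol in seq.upper()]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    print(expand_ambiguous("AYG"))
```

The number of expansions grows multiplicatively with each ambiguous position, so a sketch like this is only practical for short probes or primers.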

[0077] The amino acids shown in the application are in the L-form and are represented by the following amino acid three-letter abbreviations:

Abbreviation    Amino acid name
Ala             L-Alanine
Arg             L-Arginine
Asn             L-Asparagine
Asp             L-Aspartic Acid
Asx             L-Aspartic Acid or Asparagine
Cys             L-Cysteine
Glu             L-Glutamic Acid
Gln             L-Glutamine
Glx             L-Glutamine or Glutamic Acid
Gly             L-Glycine
His             L-Histidine
Ile             L-Isoleucine
Leu             L-Leucine
Lys             L-Lysine
Met             L-Methionine
Phe             L-Phenylalanine
Pro             L-Proline
Ser             L-Serine
Thr             L-Threonine
Trp             L-Tryptophan
Tyr             L-Tyrosine
Val             L-Valine
Xaa             L-Unknown or other
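For readers who work with one-letter notation, the abbreviations above can be converted as sketched below. The one-letter symbols are the conventional ones and are not part of the sequence listing; THREE_TO_ONE and to_one_letter are hypothetical names, and the input is assumed to be whitespace-separated three-letter codes.

```python
# Conventional one-letter symbols for the three-letter codes listed above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Asx": "B",
    "Cys": "C", "Glu": "E", "Gln": "Q", "Glx": "Z", "Gly": "G",
    "His": "H", "Ile": "I", "Leu": "L", "Lys": "K", "Met": "M",
    "Phe": "F", "Pro": "P", "Ser": "S", "Thr": "T", "Trp": "W",
    "Tyr": "Y", "Val": "V", "Xaa": "X",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert e.g. 'Met Ala Leu' into 'MAL'; raises KeyError on unknown codes."""
    return "".join(THREE_TO_ONE[code] for code in three_letter_seq.split())


if __name__ == "__main__":
    print(to_one_letter("Met Ala Leu"))
```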

Introduction

[0078] The present invention is based, at least in part, on the discovery and cloning of two (2) PcG genes from Zea mays L. (maize) termed the Mez1 gene and the Mez2 gene. The Mez1 gene has been mapped to chromosome 6 (bin 6.01-6.02) and the Mez2 gene has been mapped to chromosome 9 (bin 9.04).

[0079] The present invention is applicable to a broad range of types of plants, including, but not limited to, Zea mays L., Oryza sativa, Secale cereale, Triticum aestivum, Daucus carota, Brassica oleracea, Cucumis melo, Cucumis sativus, Lactuca sativa, Solanum tuberosum, Lycopersicon esculentum, Phaseolus vulgaris, and Brassica napus.

Nucleic Acids

[0080] In one embodiment, the present invention relates to isolated nucleic acids of DNA, RNA, and analogs and/or chimeras thereof, comprising a polynucleotide, wherein said polynucleotide is a Mez1 or Mez2 polynucleotide which encodes a polypeptide of SEQ ID NO: 2 (a Mez1 polypeptide) or SEQ ID NO: 4 (a Mez2 polypeptide), and conservatively modified variants thereof. It is known in the art that the degeneracy of the genetic code allows for a plurality of polynucleotides to encode the identical amino acid sequence. These “silent variations”, as they are commonly referred to, can be used to selectively hybridize and detect polymorphic variants of the polynucleotides of the present invention.
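The scale of this degeneracy can be illustrated with a small calculation. The sketch below assumes the standard genetic code and counts how many distinct coding sequences encode a given peptide by multiplying the number of synonymous codons for each residue; the names CODONS_PER_RESIDUE and count_encodings are hypothetical, and the example peptide is arbitrary rather than taken from SEQ ID NO: 2 or 4.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct polynucleotides encoding the peptide (stop codon excluded)."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())


if __name__ == "__main__":
    print(count_encodings("MKL"))
```

Even a short peptide is encoded by many silent variants, which is why a full length polypeptide corresponds to a very large family of polynucleotides.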

[0081] An example of a Mez1 polynucleotide which encodes the Mez1 polypeptide of SEQ ID NO: 2 is shown in SEQ ID NO: 1. The polynucleotide of SEQ ID NO: 1 is 3180 base pairs in length.

[0082] An example of a Mez2 polynucleotide which encodes the Mez2 polypeptide of SEQ ID NO: 4 is shown in SEQ ID NO: 3. The polynucleotide of SEQ ID NO: 3 is 3030 base pairs in length.

[0083] The Mez2 polynucleotide of SEQ ID NO: 3, in addition to encoding the Mez2 polypeptide, contains two (2) alternative splice sites. These alternative splice sites are referred to herein as Mez2 alternative splice 1 (“Mez2^(as1)”) (SEQ ID NO: 5) and Mez2 alternative splice 2 (“Mez2^(as2)”) (SEQ ID NO: 6). The polynucleotide sequence of Mez2^(as1) (hereinafter “Mez2^(as1) polynucleotide”) is identical to the Mez2 polynucleotide of SEQ ID NO: 3 except that the Mez2^(as1) polynucleotide is missing a fragment of 659 base pairs in length. Specifically, this deleted fragment corresponds to 1016 to 1676 in the Mez2 polynucleotide of SEQ ID NO: 3. The Mez2^(as1) polynucleotide deletion causes a frameshift and a truncated protein of 341 amino acids which is missing the SANT, nuclear localization signal, cysteine rich region and SET domains (See FIG. 7).

[0084] The polynucleotide sequence of Mez2^(as2) (hereinafter “Mez2^(as2) polynucleotide”) is identical to the Mez2 polynucleotide of SEQ ID NO: 3 except that the Mez2^(as2) polynucleotide is missing a fragment of 810 base pairs in length. Specifically, this deleted fragment corresponds to 1016 to 1827 in the Mez2 polynucleotide of SEQ ID NO: 3. The Mez2^(as2) polynucleotide deletion does not result in a frameshift. The deletion in Mez2^(as2) results in a 624 amino acid protein that is missing the SANT domain (See FIG. 7).
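Whether a deletion shifts the reading frame depends only on whether its length is a multiple of three. The following sketch checks the two splice variants described above. It assumes that the quoted positions (1016 and 1676, 1016 and 1827) are the retained bases flanking each deletion, which is the reading under which the stated lengths of 659 and 810 base pairs are reproduced; the function names are hypothetical.

```python
def deletion_length(left_flank: int, right_flank: int) -> int:
    """Bases removed between two retained flanking positions.

    Assumption: both quoted coordinates are kept and only the bases
    strictly between them are deleted.
    """
    return right_flank - left_flank - 1


def causes_frameshift(deleted_bases: int) -> bool:
    """A deletion preserves the reading frame only if it removes whole codons."""
    return deleted_bases % 3 != 0


if __name__ == "__main__":
    for name, left, right in (("as1", 1016, 1676), ("as2", 1016, 1827)):
        n = deletion_length(left, right)
        print(name, n, "frameshift" if causes_frameshift(n) else "in frame")
```

Under this assumption the first deletion leaves a remainder of two bases and shifts the frame, while the second removes 270 whole codons and stays in frame, in line with the description of the two variants.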

[0085] In another embodiment, the present invention also provides isolated nucleic acids comprising polynucleotides encoding conservatively modified variants of the Mez1 or Mez2 polypeptides of SEQ ID NOS: 2 and 4. Such conservatively modified variants can be used for a number of useful purposes, such as, but not limited to, the generation or selection of antibodies immunoreactive to the non-variant polypeptide. Also, in yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides encoding one or more polymorphic variants of polypeptides/polynucleotides. Polymorphic variants are used to follow the segregation of chromosome regions and are typically used in marker assisted selection methods for crop improvement.

[0086] In another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides of the present invention which selectively hybridize, under selective hybridization conditions (i.e., stringent hybridization conditions), to the Mez1 or Mez2 polynucleotide. The isolation of such nucleic acids can be accomplished by a number of techniques. For example, oligonucleotide probes based upon the Mez1 and Mez2 polynucleotides described herein can be used to identify, isolate or amplify partial or full length clones in a deposited library (such as a cDNA or genomic DNA library). For example, a cDNA or genomic library can be screened using a probe based upon the sequence of the Mez1 or Mez2 polynucleotides described herein. These probes can be used to hybridize with genomic DNA or cDNA sequences to isolate homologous genes in the same or different plant species.

[0087] Alternatively, nucleic acids of interest can be amplified from nucleic acid samples using various amplification techniques known in the art. For example, PCR can be used to amplify the sequences of the Mez1 or Mez2 genes directly from genomic DNA, from cDNA, from genomic libraries or cDNA libraries. PCR and other in vitro amplification methods (such as LCR, etc.) can be used to clone nucleic acid sequences that code for proteins to be expressed, to make nucleic acids for use as probes for detecting the presence of the desired mRNA in samples, for nucleic acid sequencing or for other purposes.

[0088] In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides, wherein the polynucleotides of said nucleic acids have a specified identity at the nucleotide level to the previously described Mez1 or Mez2 polynucleotides. The percentage of identity is at least 60%, preferably 70%, more preferably 80%, even more preferably 90% and most preferably 95%.
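The identity thresholds above can be applied to a pair of sequences once they have been aligned. The sketch below assumes the two sequences are already aligned to equal length with “-” marking gaps, and it defines percent identity as matching columns divided by alignment length, which is one of several conventions in use. It does not reproduce the GAP program, and the function names are hypothetical.

```python
from typing import Optional

# Identity tiers recited in the specification, highest first.
TIERS = [
    (95.0, "most preferably"),
    (90.0, "even more preferably"),
    (80.0, "more preferably"),
    (70.0, "preferably"),
    (60.0, "at least"),
]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical, non-gap characters."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def identity_tier(pct: float) -> Optional[str]:
    """Return the highest tier met, or None if identity is below 60%."""
    for threshold, label in TIERS:
        if pct >= threshold:
            return label
    return None


if __name__ == "__main__":
    pct = percent_identity("ATGCATGCAT", "ATGCATGCAA")
    print(pct, identity_tier(pct))
```

The value obtained depends on the alignment and on how gaps are counted, so a result near a threshold should be checked with the alignment program and parameters actually relied upon.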

[0089] In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides complementary to the previously described Mez1 or Mez2 polynucleotides. One skilled in the art will recognize that complementary sequences will base pair throughout their entire length with the previously described Mez1 or Mez2 polynucleotides (meaning that they have 100% sequence identity over their entire length). Complementary bases associate through hydrogen bonding in double stranded nucleic acids. Base pairs known to be complementary include the following: adenine and thymine, guanine and cytosine and adenine and uracil.
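The base pairing rules above can be expressed as a short sketch that derives the complementary strand of a DNA sequence and tests whether two strands pair over their entire length. Sequences are assumed to be written 5′ to 3′ and to contain only A, C, G and T; the names are hypothetical.

```python
# Watson-Crick pairing for DNA: A with T, G with C.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def fully_complementary(strand_a: str, strand_b: str) -> bool:
    """True if the two strands base pair along their entire length."""
    return reverse_complement(strand_a) == strand_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(fully_complementary("ATGC", "GCAT"))
```

For an RNA strand the same logic applies with uracil in place of thymine.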

[0090] In yet another embodiment, the present invention relates to isolated nucleic acids comprising polynucleotides which comprise at least 15 contiguous bases from the previously described Mez1 or Mez2 polynucleotides. More specifically, the length of the polynucleotides can be from about 15 contiguous bases to the length of the Mez1 or Mez2 polynucleotide of which the polynucleotide is a subsequence. For example, such polynucleotides can be 15, 35, 55, 75, 95, 100, 200, 400, 500, 750, etc. contiguous nucleotides in length from the previously described Mez1 or Mez2 polynucleotide. In addition, such subsequences can optionally comprise or lack certain structural characteristics of the Mez1 or Mez2 polynucleotides from which they are derived.

Polypeptides

[0091] In one embodiment, the present invention relates to a Mez1 polypeptide of SEQ ID NO: 2. The Mez1 polypeptide is 931 amino acids in length, has a molecular weight of about 103.75 kDa and an isoelectric point of 8.91.

[0092] In a second embodiment, the present invention relates to a Mez2 polypeptide of SEQ ID NO: 4. The Mez2 polypeptide is 893 amino acids in length, has a molecular weight of about 100.01 kDa and an isoelectric point of 8.47.

[0093] The Mez1 and Mez2 polypeptides contain a number of domains. These domains are: EZD1, EZD2, SANT domain, cysteine rich region and SET domain (See, FIG. 5). The EZD1 and EZD2 regions are conserved domains specific to the E(z) family. EZD1 is a highly conserved acidic region of 74 amino acids in the N-terminal region. The EZD1 domain contains a significant proportion of charged residues (34-39%) with seven more acidic residues than basic residues. The function of this domain is presently not known. The EZD1 domain is highly conserved between Mez1, Mez2, clf and EZA1. EZD2 is a small, highly conserved region of 44 amino acids near amino acid 250 of the plant and animal E(z)-like proteins. The EZD2 region is composed primarily of polar or charged residues. Two (2) regions near the C-terminus of these proteins are well conserved among all E(z) proteins (See FIG. 5). These are the cysteine rich region and the SET domain. The Cys-rich region has fifteen invariant cysteine residues with a conserved spacing pattern in all E(z) homologs. The spacing of the cysteine residues in all E(z) homologs is unique and is different from other Cys-rich zinc finger domains involved in DNA binding. The function of the cysteine rich domain is not known but it is highly conserved among all E(z)-like genes. The SET domain is also highly conserved and is believed to be involved in mediating protein-protein interactions (Cui et al., Nat. Genet., 18:331-337 (1998); Huang et al., J. Biol. Chem., 273:15933-15939 (1998)). The SANT domain is often involved in non-specific DNA binding (Aasland, R., et al., Trends Biochem. Sci., 21(3):87-88 (1996)).

[0094] In another embodiment, the present invention relates to a polypeptide having a specified percentage of sequence identity with the Mez1 or Mez2 polypeptide of the present invention. The percentage of sequence identity is at least 60%, preferably 70%, more preferably 80%, even more preferably 90% and most preferably 95%.

[0095] The present invention also provides antibodies which specifically react with the Mez1 or Mez2 polypeptides of the present invention under immunologically reactive conditions. An antibody immunologically reactive with a particular antigen can be generated in vivo or by recombinant methods such as by selection of libraries of recombinant antibodies in phage or similar vectors.

[0096] Many methods of making antibodies are known to persons skilled in the art. A number of immunogens can be used to produce antibodies specifically reactive to the isolated Mez1 or Mez2 polypeptides of the present invention under immunologically reactive conditions. An isolated recombinant, synthetic, or native Mez1 or Mez2 polypeptide of the present invention is the preferred immunogen (antigen) for the production of monoclonal or polyclonal antibodies.

[0097] The Mez1 or Mez2 polypeptide can be injected into an animal capable of producing antibodies. Either monoclonal or polyclonal antibodies can be generated for subsequent use in immunoassays to measure the presence and quantity of the Mez1 or Mez2 polypeptide. Methods of producing monoclonal or polyclonal antibodies are known to persons skilled in the art (See, Coligan, Current Protocols in Immunology, Wiley/Greene, NY (1991); Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Press, NY (1989); and Goding, Monoclonal Antibodies: Principles and Practice (2d ed.), Academic Press, New York, N.Y. (1986)).

[0098] The Mez1 or Mez2 polypeptides and antibodies can be labeled by joining, either covalently or non-covalently, a substance which provides for a detectable signal. A wide variety of labels and conjugation techniques are known to persons skilled in the art. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, chemiluminescent moieties, magnetic particles, and the like. Patents teaching the use of such labels include U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.

[0099] The antibodies of the present invention can be used to screen plants for the expression of the Mez1 or Mez2 polypeptides of the present invention. The antibodies of the present invention can also be used for affinity chromatography for the purpose of isolating Mez1 or Mez2 polypeptides.

[0100] The present invention further provides Mez1 or Mez2 polypeptides that specifically bind, under immunologically reactive conditions, to an antibody generated against a defined immunogen, such as an immunogen consisting of the Mez1 or Mez2 polypeptides. Immunogens will generally have a length of at least 10 contiguous amino acids from the Mez1 or Mez2 polypeptides of the present invention, respectively.

[0101] A variety of immunoassay formats are appropriate for selecting antibodies specifically reactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically reactive with a protein (See Harlow and Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York (1988), for a description of immunoassay formats and conditions that can be used to determine specific reactivity). The antibody may be polyclonal but preferably is monoclonal. Generally, antibodies cross-reactive to Mez1 or Mez2 polypeptides are removed by immunoabsorption.

[0102] Immunoassays in the competitive binding format are typically used for cross-reactivity determinations. For example, an immunogenic Mez1 or Mez2 polypeptide can be immobilized to a solid support. Polypeptides added to the assay compete with the binding of the antisera to the immobilized antigen. The ability of the above polypeptides to compete with the binding of the antisera to the immobilized Mez1 or Mez2 polypeptide is compared to the immunogenic Mez1 or Mez2 polypeptide. The percent cross-reactivity for the above proteins is calculated, using standard calculations known to persons skilled in the art.

[0103] The immunoabsorbed and pooled antisera are then used in a competitive binding immunoassay to compare a second “target” polypeptide to the immunogenic polypeptide. In order to make this comparison, the two polypeptides are each assayed at a wide range of concentrations and the amount of each polypeptide required to inhibit 50% of the binding of the antisera to the immobilized protein is determined using standard techniques. If the amount of the target polypeptide required is less than twice the amount of the immunogenic polypeptide that is required, then the target polypeptide is said to specifically bind to an antibody generated to the immunogenic protein. As a final determination of specificity, the pooled antisera are fully immunoabsorbed with the immunogenic polypeptide until no binding to the polypeptide used in the immunoabsorption is detectable. The fully immunoabsorbed antisera are then tested for reactivity with the test polypeptide. If no reactivity is observed, then the test polypeptide is specifically bound by the antisera elicited by the immunogenic protein.
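The 50% inhibition comparison described above can be sketched as follows. The specification leaves the curve-fitting method to standard techniques, so the sketch assumes simple linear interpolation between the two measured concentrations that bracket 50% binding; the function names and the example readings are hypothetical and are not experimental data.

```python
from typing import Sequence


def ic50(concentrations: Sequence[float], percent_binding: Sequence[float]) -> float:
    """Concentration giving 50% binding, by linear interpolation.

    Assumption: concentrations ascend and binding falls as competitor rises.
    """
    points = list(zip(concentrations, percent_binding))
    for (c0, b0), (c1, b1) in zip(points, points[1:]):
        if b0 >= 50.0 >= b1 and b0 != b1:
            return c0 + (b0 - 50.0) * (c1 - c0) / (b0 - b1)
    raise ValueError("binding never crosses 50% in the measured range")


def specifically_binds(target_ic50: float, immunogen_ic50: float) -> bool:
    """Criterion from the text: target needs less than twice the immunogen amount."""
    return target_ic50 < 2.0 * immunogen_ic50


if __name__ == "__main__":
    immunogen = ic50([1.0, 2.0, 4.0], [80.0, 50.0, 20.0])
    target = ic50([1.0, 2.0, 4.0], [90.0, 60.0, 40.0])
    print(immunogen, target, specifically_binds(target, immunogen))
```

In practice a fitted dose-response curve would normally replace the interpolation, but the decision rule itself is the simple ratio test shown.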

Production of Recombinant Expression Cassettes

[0104] Isolated nucleic acids of the present invention can be used in recombinant expression cassettes. One of ordinary skill in the art will recognize that a nucleic acid used in the recombinant expression cassettes described herein encoding a functional Mez1 or Mez2 polypeptide need not have a sequence identical to the exemplified nucleic acids disclosed herein and does not need to be full length, so long as the desired functional domain of the Mez1 or Mez2 protein is expressed.

[0105] A nucleic acid comprising a polynucleotide coding for the desired functional Mez1 or Mez2 polypeptide, for example a cDNA or a genomic sequence encoding a full length protein, can be used to construct a recombinant expression cassette which can be introduced into a desired plant. An expression cassette will typically comprise the functional Mez1 or Mez2 nucleic acid operably linked in either the sense or antisense direction to transcriptional and translational initiation regulatory sequences which will direct the transcription of the sequence from the functional Mez1 or Mez2 nucleic acid in the intended tissues for the transformed plant. Examples of transcriptional and translational initiation regions that can be used in the recombinant expression cassette are well known in the art.

[0106] The recombinant expression cassette will contain a promoter which is used to direct expression of the polynucleotides of the present invention in one, more than one, or in all of the tissues of a regenerated plant. For example, a constitutive plant promoter may be employed which will direct expression of the functional Mez1 or Mez2 polypeptide in all tissues of a regenerated plant. Examples of constitutive promoters include, but are not limited to, the cauliflower mosaic virus (hereinafter “CaMV”) 35S transcription initiation region, the NOS promoter, the RUBISCO promoter, the 1′ or 2′-promoter derived from T-DNA of Agrobacterium tumefaciens, etc. A suitable constitutive plant promoter to be used in the recombinant expression cassette can readily be determined by persons skilled in the art.

[0107] Alternatively, an inducible plant promoter can be used. An inducible plant promoter may direct expression of the Mez1 or Mez2 nucleic acid in specific tissue or under more precise environmental or developmental control in a regenerated plant. Examples of environmental conditions that may affect transcription by inducible promoters include pathogen attack, anaerobic conditions, or the presence of light. Examples of inducible promoters include, but are not limited to, the Hsp70 promoter (which is inducible by heat stress), the PPDK promoter (which is inducible by light), etc.

[0108] Promoters derived from the Mez1 or Mez2 genes can be used to direct expression. These promoters can also be used to direct expression of heterologous sequences. The promoters can be used, for example, in recombinant expression cassettes to drive expression of the Mez1 or Mez2 nucleic acids of the present invention or heterologous sequences.

[0109] Such promoters can be identified as follows. The 5′ portions of the Mez1 or Mez2 genes described herein are analyzed for sequences characteristic of promoter sequences. For instance, promoter sequence elements include the TATA box consensus sequence (TATAAT), which is usually 20 to 30 base pairs upstream of the transcription start site. In plants, further upstream from the TATA box, at positions −80 to −100, there is typically a promoter element with a series of adenines surrounding the trinucleotide G (or T) N G. (See, J. Messing et al., in Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and Hollaender, eds. 1983)).
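The search for promoter elements described above can be illustrated with a minimal sketch that scans a 5′ flanking sequence for the TATAAT consensus and reports where each match begins relative to the transcription start site. It assumes the input ends immediately upstream of the start site, so that its last base is position −1, and it looks only for exact matches; the function name and the example sequence are hypothetical and are not taken from the Mez1 or Mez2 genes.

```python
import re


def find_motif_positions(upstream: str, motif: str = "TATAAT") -> list[int]:
    """Start positions of exact motif matches, relative to the start site.

    Assumption: the last base of `upstream` is position -1. A lookahead is
    used so that overlapping matches are also reported.
    """
    seq = upstream.upper()
    pattern = re.compile("(?=" + re.escape(motif.upper()) + ")")
    return [match.start() - len(seq) for match in pattern.finditer(seq)]


if __name__ == "__main__":
    flank = "GGGG" + "TATAAT" + "C" * 24
    print(find_motif_positions(flank))
```

Real promoter elements often deviate from the consensus, so a practical analysis would typically allow mismatches or use a position weight matrix rather than an exact string match.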

[0110] If proper polypeptide expression is desired, a polyadenylation region at the 3′-end of the Mez1 or Mez2 polynucleotide coding region should be included. The polyadenylation region can be derived from a natural gene, from a variety of other plant genes, or from T-DNA. For example, polyadenylation regions can be derived from the nopaline synthase or octopine synthase genes.

[0111] The expression cassette comprising the Mez1 or Mez2 nucleic acids will typically comprise one or more marker genes which confer a selectable phenotype on plant cells. For example, the marker gene can encode biocide resistance, particularly antibiotic resistance, such as resistance to kanamycin, G418, bleomycin, hygromycin, or herbicide resistance, such as resistance to chlorsulfuron.

[0112] As discussed briefly above, the Mez1 or Mez2 nucleic acids can be inserted into a recombinant expression cassette in the antisense direction. Expression of the Mez1 or Mez2 nucleic acid in the antisense direction will result in the production of antisense RNA. It is well known to persons skilled in the art that a cell manufactures protein by transcribing the DNA of the gene encoding a protein to produce RNA, which is then processed to messenger RNA (hereinafter “mRNA”) (e.g., by the removal of introns) and finally translated by ribosomes into protein. This process may be inhibited in the cell by the presence of antisense RNA. It is believed that this inhibition takes place by formation of a complex between the two complementary strands of RNA, thus preventing the formation of protein. It is presently unclear how this mechanism works. However, it is believed that the complex may interfere with further translation, degrade the mRNA, or have more than one of these effects. This antisense RNA can be produced in the cell by transformation of the cell with an appropriate recombinant expression cassette designed to transcribe the non-template strand (as opposed to the template strand) of the relevant gene (or of a nucleic acid sequence showing substantial identity therewith).

[0113] The use of antisense RNA to downregulate the expression of specific plant genes is well known. Reduction in gene expression has been shown to lead to changes in the phenotype of a plant, either at the level of gross visible phenotypic difference (see van der Krol et al., Nature, 333:866-869 (1988)), or at a more subtle biochemical level (Smith et al., Nature, 334:724-726 (1988)). Another method for inhibiting gene expression in transgenic plants involves the use of sense RNA transcribed from an exogenous template to downregulate the expression of specific plant genes (See, Jorgensen, Keystone Symposium “Improved Crop and Plant Products through Biotechnology”, Abstract X1-022 (1994)). Thus, both antisense and sense RNA can be used to downregulate gene expression in plants, and both approaches are encompassed by the present invention.

Production of Transgenic Plants

[0114] Techniques for transforming a wide variety of higher plant species using the recombinant expression cassettes hereinbefore described are well known and described in the technical and scientific literature (See, for example, Weising et al., Ann. Rev. Genet., 22:421-477 (1988)).

[0115] The hereinbefore described recombinant expression cassettes can be introduced into the genome of a desired plant host by a variety of conventional techniques which are well known to persons skilled in the art. For example, the recombinant expression cassette can be introduced directly into the genomic DNA of the plant cell using techniques such as electroporation, PEG poration, particle bombardment, silicon fiber delivery, and microinjection of plant cell protoplasts or embryogenic callus, or the expression cassettes can be introduced directly to plant tissue using ballistic methods, such as DNA particle bombardment. Alternatively, the expression cassettes may be combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens or Agrobacterium rhizogenes host vector. The virulence functions of the Agrobacterium host will direct the insertion of the expression cassette and adjacent marker gene into the plant cell DNA when the cell is infected by the bacteria.

[0116] Plants which can be transformed with the recombinant expression cassette of the present invention include, but are not limited to, Zea mays L., Oryza sativa, Secale cereale, Triticum aestivum, Daucus carota, Brassica oleracea, Cucumis melo, Cucumis sativus, Lactuca sativa, Solanum tuberosum, Lycopersicon esculentum, Phaseolus vulgaris, Brassica napus, etc.

[0117] Transformation techniques are well known to persons skilled in the art. For example, the introduction of expression cassettes using polyethylene glycol precipitation is described in Paszkowski et al., EMBO J., 3:2712-2722 (1984). Electroporation techniques are described in Fromm et al., Proc. Natl. Acad. Sci. USA, 82:5824 (1985). Biolistic transformation techniques are described in Klein et al., Nature, 327:70-73 (1987).

[0118] Agrobacterium tumefaciens-mediated transformation techniques are well known to persons skilled in the art (See, for example, Horsch et al., Science 233:496-498 (1984), and Fraley et al., Proc. Natl. Acad. Sci. USA, 80:4803 (1983)). Although Agrobacterium is useful primarily in dicots, certain monocots can be transformed by Agrobacterium. U.S. Pat. No. 5,550,318 describes Agrobacterium transformation of maize.

[0119] Moreover, the following methods of transfection or transformation can also be used: (a) Agrobacterium rhizogenes-mediated transformation (See, Lichtenstein and Fuller, in Genetic Engineering, vol. 6, PWJ Rigby, Ed., London, Academic Press, (1987)); (b) liposome-mediated DNA uptake (See, Freeman et al., Plant Cell Physiol., 25:1353 (1984)); and (c) the vortexing method (See, Kindle, Proc. Natl. Acad. Sci. USA, 87:1228 (1990)).

[0120] Transformed plant cells which are derived by any of the above transformation techniques can be cultured to regenerate a whole plant which possesses the transformed genotype. Such regeneration techniques rely on manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a biocide and/or herbicide marker which has been introduced together with the Mez1 or Mez2 nucleic acid. Plant regeneration from cultured protoplasts is described in Evans et al., Protoplasts Isolation and Culture, Handbook of Plant Cell Culture, pp. 124-176, Macmillan Publishing Company, New York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants, organs, or parts thereof. Such regeneration techniques are described generally in Klee et al., Ann. Rev. of Plant Phys. 38:467-486 (1987).

[0121] One of ordinary skill in the art will recognize that after the expression cassette is stably incorporated in transgenic plants and confirmed to be operable, it can be introduced into other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed.

[0122] Transgenic plants containing the expression cassettes described herein can be identified by using restriction enzymes or High Performance Liquid Chromatography. Techniques for restriction enzyme analysis and High Performance Liquid Chromatography are well known to persons skilled in the art. Transgenic plants containing the expression cassettes described herein can also be identified by Northern blot analysis, which is well known to persons skilled in the art.

Synthetic Polypeptides and Purification of Polypeptides

[0123] In addition to being produced recombinantly, the polypeptides of the present invention can also be produced synthetically, using techniques known in the art. For example, polypeptides having a length of about 50 amino acids can be synthesized using solid phase synthesis techniques, such as those described by Barany and Merrifield, Solid-Phase Peptide Synthesis, pp. 3-284 in The Peptides. Analysis, Synthesis, Biology. Vol. 2: Special Methods in Peptide Synthesis, Part A.; Merrifield et al., J. Am. Chem. Soc. 85:2149-2156 (1963). Polypeptides having a length greater than about 50 amino acids can be synthesized by condensation of the amino and carboxy termini of shorter fragments, a technique which is well known to persons skilled in the art.

[0124] Polypeptides of the present invention, produced either recombinantly or synthetically, can be purified using standard techniques known to persons skilled in the art, including, but not limited to, column chromatography, selective precipitation with ammonium sulfate, affinity chromatography, etc.

Methods for Repressing the Expression or Inhibiting the Repression of Expression of a Target Gene In Vivo

[0125] The Mez1 and Mez2 proteins belong to the E(z) group of Polycomb proteins. As discussed previously, it is known in the art that the esc and esc-like (homolog) proteins interact with the E(z) and E(z)-like proteins in vivo to form complexes. The E(z) and esc proteins interact with each other, but are not known to physically interact with any other characterized PcG proteins. While C. elegans and plants contain homologs of the proteins in the E(z)/esc complex, they do not contain the PRC1 complex. The E(z)/esc complex has been found to repress the expression of a gene during a specific developmental stage and in a specific tissue in plants and C. elegans, which lack the PRC1 complex (see Goodrich et al., Nature, 386(6620):44-51 (1997), Holdeman et al., Development, 125(13):2457-67 (1998), Korf et al., Development, 125(13):2469-78 (1998), Kelly and Fire, Development, 125(13):2451-6 (1998)).

[0126] The Mez1 and Mez2 nucleic acids and proteins of the present invention can be used for a number of useful purposes. First, the Mez1 and/or Mez2 proteins can be used in a method to repress the expression of a desired target gene in specific tissue in a plant in vivo. The gene targeted for silencing would be in cells expressing either endogenous or introduced Mez1 and/or Mez2 and ZmFIE2 proteins. The ZmFIE2 protein is an esc-like protein isolated from Zea mays L. and is described in copending application U.S. Ser. No. 09/___ filed on Jul. 16, 2001 and entitled, “Polycomb Gene from Maize - ZmFIE2”, hereby incorporated by reference. The Mez1 and/or Mez2 nucleic acids and ZmFIE2 nucleic acids could be constitutively expressed in these cells or introduced into a plant containing the cells by crossing. The gene targeted for silencing may have any of a number of different promoters, but would also contain DNA sequence motifs or contexts to which the Mez1 and/or Mez2/ZmFIE2 complex is targeted. This would allow silencing of a gene in specific tissues or at specific times in development. For example, immature roots contain a non-functional Mez2 protein, but a functional ZmFIE2 protein. Therefore, these cells would not silence an introduced or endogenous gene containing DNA sequences which attract the Mez2/ZmFIE2 complex. In contrast, developing leaf tissues contain functional Mez2 and ZmFIE2 proteins. Therefore, an introduced or endogenous gene containing DNA sequences which attract the Mez2/ZmFIE2 complex would be silenced in those tissues.

[0127] Alternatively, the Mez1 and Mez2 proteins of the present invention can be used in a method to prevent the repression of a particular desired target gene in vivo in a plant. One mechanism by which this could be accomplished is by producing dominant negative mutant forms of said Mez1 and Mez2 proteins which fail to form a complex with any esc or esc-like proteins. In this approach, the recombinant expression cassette encodes a mutant Mez1 and/or Mez2 polypeptide (the mutant polypeptide contains various substitutions, deletions, additions, etc.) which fails to bind properly to any esc or esc-like proteins. As a result, the complex would not form.

[0128] A second mechanism by which this could be accomplished is through the use of antisense RNA. In this approach, recombinant expression cassettes containing the Mez1 and/or Mez2 nucleic acids in the antisense direction can be inserted into a plant. Preferably, the recombinant expression cassettes contain a tissue-specific promoter which will direct expression to the tissues containing the desired target gene of interest. The antisense RNA produced by the expression cassette will hybridize with the endogenous mRNA produced from the Mez1 or Mez2 genes within the plant, thus preventing the expression of any Mez1 or Mez2 protein. Because there will be no Mez1 or Mez2 protein, the complex between the Mez1 and/or Mez2 proteins and any esc or esc-like proteins will fail to form.

[0129] The use of the Mez1 and Mez2 proteins of the present invention to repress the expression or prevent the repression of the expression of a target gene in specific tissue in a plant in vivo could be used to regulate homeotic gene expression in plants to create novel plants having improved agronomic traits (see Goodrich et al., Nature, 386(6620):44-51 (1997)).

[0130] The following Examples are offered by way of illustration, not limitation.

EXAMPLE 1 Cloning and Characterization of the Mez1 and Mez2 Genes

[0131] Cloning of Mez1 and Mez2: Drosophila E(z) (AAC46462) was used in a TBLASTN search of the Pioneer Hi-Bred EST database. Two contigs with significant similarity were discovered, and named Maize Ez-like 1 (Mez1) and Maize Ez-like 2 (Mez2). Other contigs containing a SET domain were also present but displayed more similarity to trithorax than to E(z). The ctsbp19 clone contained the 3′ 801 bp of Mez1. The Mez2 contig originating from the cbmfe16 clone contained the 3′ 1144 bp of the Mez2 cDNA. To obtain full-length clones and sequence for the 5′ region of both genes, Rapid Amplification of cDNA Ends (RACE) was performed. Additionally, the 3′ ends of Mez1 and Mez2 were obtained by RACE to verify the EST sequence. RACE reactions were performed on one-week seedling Mo17 cDNA using the Marathon cDNA kit (Clontech, Palo Alto, Calif.) with Advantage2 polymerase (Clontech, Palo Alto, Calif.). The primers used were as follows: Mez1F1 (GGG TGT GGT GAT GGT ACA TTG G; SEQ ID NO: 7), Mez1R2 (CAG CTT GTC ACC CAT TCT GTA TGC G; SEQ ID NO: 8), Mez2R3 (TGC CTC GTC CTT CTT TGA TCC TTC G; SEQ ID NO: 9) and Mez2F3 (CTC ACA AGG AAG CAG ACA AAC GCG G; SEQ ID NO: 10). RACE products were gel purified and cloned into pGEM-T Easy (Promega, Madison, Wis.).
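
For illustration only, the length and GC content of the four RACE primers listed above can be tabulated with a short Python sketch. The helper name primer_stats is hypothetical, and the sequences are copied from the primer list in this paragraph; the calculation is a routine check and is not asserted to be part of the procedure actually used.

```python
# Illustrative sketch only: basic properties of the RACE primers listed above.
# The helper name primer_stats is hypothetical; sequences are copied from the text.

RACE_PRIMERS = {
    "Mez1F1": "GGG TGT GGT GAT GGT ACA TTG G",
    "Mez1R2": "CAG CTT GTC ACC CAT TCT GTA TGC G",
    "Mez2R3": "TGC CTC GTC CTT CTT TGA TCC TTC G",
    "Mez2F3": "CTC ACA AGG AAG CAG ACA AAC GCG G",
}

def primer_stats(spaced_seq: str) -> tuple[int, int]:
    """Return (length in nucleotides, number of G or C bases)."""
    seq = spaced_seq.replace(" ", "").upper()
    gc = sum(1 for base in seq if base in "GC")
    return len(seq), gc

for name, seq in RACE_PRIMERS.items():
    length, gc = primer_stats(seq)
    print(f"{name}: {length} nt, {100 * gc / length:.1f}% GC")
```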

[0132] Sequencing: The plasmids were sequenced using BigDye terminator cycle sequencing on an ABI sequencer (Perkin-Elmer Applied Biosystems). Sequencing reactions were done in a 10 μl volume with 320 ng DNA and 10 pg of primer. Primers used were as follows: T7 (Promega), SP6 (Promega), Mez1F1, Mez1F2 (TAC CTT GGT GAG TAC ACT GGG GAA C; SEQ ID NO: 11), Mez1F4 (CCA TTT CGT GTA TCA GAC CTA AGC; SEQ ID NO: 12), Mez1F5 (CAT CAA CGC CCT CCA AGC; SEQ ID NO: 13), Mez1R6 (TGC CAC ATT CTT GAA CTG TCA TCC G; SEQ ID NO: 14), Mez1R4 (GCA CAG TGA CAT CCT CGA AAA CG; SEQ ID NO: 15), Mez1R5 (GTC CCT GCT CAA TTG CC; SEQ ID NO: 16), Mez2F4 (GCG GAC AAT TGT GCG GTT CG; SEQ ID NO: 17), Mez2F5 (GGT TGT TCA CAG AAT TTG G; SEQ ID NO: 18), Mez2R4 (CTT CCT AAC AAA ATC CTT TGC TGT TG; SEQ ID NO: 19) and Mez2R5 (TTG CTC CAT GTA GTC TTG; SEQ ID NO: 20).

[0133] Sequence analysis: The sequences were assembled through the contig assembly program (http://gcg.tigem.it/ASSEMBLY/assemble.html). Reverse complement, translation and ClustalW were all accessed from the ABCC sequence analysis page (http://biosci.cbs.umn.edu/seqanal/). ClustalW alignments were processed using Boxshade (http://www.ch.embnet.org/software/BOX_form.html). All BLAST searches were performed using the NCBI BLAST feature. For some searches the advanced BLAST feature was used and a target organism was specified. Targeting signals and putative localization were predicted using PSORT (http://psort.nibb.ac.jp/). Domains were identified using SMART (http://smart.embl-heidelberg.de/).
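
The translation step mentioned above can be illustrated with a minimal Python sketch using the standard genetic code. This is not the web tool that was used; the function name translate and the choice of input (the first eleven codons of the Mez1 open reading frame in SEQ ID NO: 1) are assumptions made for the example.

```python
# Illustrative sketch only: frame-1 translation with the standard genetic code.
# The function name and the example fragment are assumptions for this example.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Build the 64-entry standard codon table, with codons enumerated in TCAG order.
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Start of the Mez1 open reading frame (from SEQ ID NO: 1).
orf_start = "atggaagcagaggctgccgcggcggtagtggcg"
print(translate(orf_start))
```

The one-letter output corresponds to the first residues of SEQ ID NO: 2 (Met Glu Ala Glu Ala Ala Ala Ala Val Val Ala).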

[0134] Phylogenetic analysis: The SET domains from all E(z)-like proteins were aligned using ClustalW. This alignment was then submitted to the PHYLIP server at http://bioweb.pasteur.fr/seqanal/phylogeny/phylip-uk.html. The protpars feature was used with bootstrapping performed before analysis. One hundred replicates were examined to determine bootstrap values. The consensus tree was then displayed with bootstrap values.
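
The resampling idea behind bootstrap values can be sketched as follows. This is a generic illustration in Python and not the PHYLIP implementation; the toy alignment, the function name bootstrap_replicates and the random seed are all hypothetical. Each replicate is built by drawing alignment columns with replacement, and a tree would then be inferred from each replicate.

```python
# Illustrative sketch only: column-wise bootstrap resampling of an alignment.
# The toy alignment and all names are hypothetical, not data from this work.
import random

def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Yield resampled alignments; each draws columns with replacement."""
    rng = random.Random(seed)
    n_cols = len(next(iter(alignment.values())))
    for _ in range(n_replicates):
        # Pick n_cols column indices at random, allowing repeats.
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        yield {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

toy_alignment = {"seqA": "MKLV-GT", "seqB": "MKIVAGT", "seqC": "MRLV-GS"}
replicates = list(bootstrap_replicates(toy_alignment))
print(len(replicates), len(replicates[0]["seqA"]))
```

The bootstrap value for a grouping is then the number of replicate trees, out of the one hundred examined, in which that grouping appears.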

[0135] RT-PCR analysis: Total RNA was extracted from tissues including embryo, leaf, immature ear, immature tassel, 3-day root, pollen and BMS (Black Mexican Sweet) suspension cultures using TRIzol (Life Technologies Gibco/BRL). PolyA+ RNA, isolated using PolyAtract (Promega), was used to make cDNA with the Marathon cDNA Amplification Kit (Clontech). 2 ng of cDNA was used in each PCR reaction. The primers used were: Mez1F1, Mez1R1 (CGG GAC CTA ACT CTA CGG ATG G; SEQ ID NO: 21), Mez2F6 (CGC AGC TGA TAC GGC AAG TCC AAT CG; SEQ ID NO: 22) and Mez2R2 (GTA TCA TCC GGA GCG ACT CTT CAG C; SEQ ID NO: 23). Cycling conditions were as follows: 94° for 2′, 5 cycles of 94° for 30″, 70° for 30″, 72° for 1′, 5 cycles of 94° for 30″, 67.5° for 30″, 72° for 1′, then 25 cycles of 94° for 30″, 65° for 30″, 72° for 1′ followed by 72° for 7′. Each 25 μl reaction contained 1 μl of a 10 μM primer solution for each primer, 2 ng cDNA, 2.5 μl 10× buffer, 2 μl 25 mM MgCl₂, 0.3 μl 25 mM dNTPs (Promega), 0.2 μl Taq polymerase (Promega) and 17 μl ddH₂O.
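
As a worked illustration, the final concentrations implied by the reaction composition above, and the total number of amplification cycles, can be computed with a short Python sketch. The helper name final_conc is hypothetical, and the calculation assumes simple dilution into the stated 25 μl reaction volume; whether the 25 mM dNTP stock refers to each nucleotide or to the total is not stated in the text and is not assumed here.

```python
# Illustrative sketch only: final concentrations in the 25 ul RT-PCR reaction,
# computed from the stated stock concentrations and volumes (C1 * V1 = C2 * V2).
# The helper name final_conc is hypothetical.

REACTION_VOLUME_UL = 25.0

def final_conc(stock_conc: float, volume_ul: float,
               total_ul: float = REACTION_VOLUME_UL) -> float:
    """Final concentration, in the same unit as stock_conc, after dilution."""
    return stock_conc * volume_ul / total_ul

primer_uM = final_conc(10.0, 1.0)  # each primer: 1 ul of a 10 uM stock
mgcl2_mM = final_conc(25.0, 2.0)   # 2 ul of 25 mM MgCl2
dntp_mM = final_conc(25.0, 0.3)    # 0.3 ul of 25 mM dNTPs (as stated)

# Stepped annealing program: (number of cycles, annealing temperature in deg C).
CYCLE_BLOCKS = [(5, 70.0), (5, 67.5), (25, 65.0)]
total_cycles = sum(n for n, _ in CYCLE_BLOCKS)

print(f"primer {primer_uM:.2f} uM, MgCl2 {mgcl2_mM:.2f} mM, "
      f"dNTP {dntp_mM:.2f} mM, {total_cycles} cycles")
```

Under these assumptions each primer is present at 0.4 μM, MgCl₂ at 2 mM and dNTPs at 0.3 mM, over 35 cycles in total.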

RESULTS

[0137] Mez1 and Mez2:

[0138] Two contigs with significant similarity to the Drosophila E(z) were discovered in the Pioneer Hi-Bred EST database. These contigs were named Maize E(z)-like 1 (Mez1) and Maize E(z)-like 2 (Mez2). To test for the presence of Mez1 ESTs in the public maize database, the Mez1 cDNA was used in a BLASTN search (www.zmdb.iastate.edu). No Mez1 ESTs were found, but two putative trithorax hits were detected due to similarity of the E(z) and trithorax SET domains.

[0139] Mez1 was mapped to the short arm of chromosome 6 (bin 6.01-6.02). The Mez2 sequence was placed on the short arm of chromosome 9 (bin 9.04). Mutants with phenotypes similar to those of the Arabidopsis clf or medea mutants have not been mapped to these regions.

[0140] Alignment of Mez1 and Mez2:

[0141] The amino acid sequences of Mez1 and Mez2 were aligned using ClustalW (FIG. 3). The sequences are 42% identical and 56% similar over their entire lengths. The nucleotide sequences of Mez1 and Mez2 are 52% identical. In maize, it is common to find two closely related sequences due to the ancient tetraploid nature of maize. Often the two sequences that arose from the tetraploid fusion display greater than 70% nucleotide identity (Gaut and Doebley, PNAS, U.S.A., 94:6809-6814 (1997)). The lower identity of the Mez1 and Mez2 nucleotide sequences indicates that these genes were probably duplicated prior to the maize tetraploidy event. In addition, the map positions of these two sequences do not correspond to colinear regions of the maize genome (Helentjaris, T., Maize Newsletter, 69:67-81 (1995)).
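
For illustration, percent identity between two aligned sequences can be computed as in the following Python sketch. The toy sequences are hypothetical, and the choice of denominator (alignment columns in which neither sequence has a gap) is an assumption; the text does not state which convention underlies the 42% and 52% figures.

```python
# Illustrative sketch only: percent identity between two aligned sequences.
# The toy sequences and the denominator convention are assumptions.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

print(f"{percent_identity('MKLV-GTA', 'MKIVAGSA'):.1f}%")
```

Percent similarity is computed in the same way except that conservative substitutions, as defined by a scoring matrix, are also counted as matches.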

[0142] Characteristics of Mez1 and Mez2:

[0143] A putative bipartite nuclear localization signal is found in both Mez1 and Mez2 (See, FIGS. 4 and 5). Mez2 and Mez1 were aligned with the other characterized E(z)-like proteins using ClustalW (FIG. 4).

[0144] There are two regions near the C-terminus of the protein that are well conserved among all E(z) proteins (FIG. 4a). These are the Cys-rich region and the SET domain. The Cys-rich region has a number of highly conserved cysteine residues. The spacing of the cysteine residues is unlike that of other Cys-rich zinc finger domains involved in DNA binding. The function of this domain is not known, but it is highly conserved among all E(z)-like genes. Mez1 is 45% identical to E(z) in this region while Mez2 is 46% identical. The SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain found at the C-terminal end of the protein is also highly conserved. The SET domain of Mez1 is 55% identical to the E(z) SET domain (Mez2 is 54% identical). SET domains appear to be involved in mediating protein-protein interactions (Cui et al., Nat. Genet., 18:331-337 (1998); Huang et al., J. Biol. Chem., 273:15933-15939 (1998)). Interestingly, the nonspecific transcriptional activator, trithorax, also contains a SET domain, indicating that SET domains alone are not responsible for transcriptional repression.

[0145] The Mez1 and Mez2 sequences were submitted to the SMART server to identify other domains within these proteins (Schultz et al., PNAS USA, 95:5857-5864 (1998); Schultz et al., Nucl. Acids Res., 28:231-234 (2000)). In addition to the SET domain, a SANT (SWI3, ADA2, N-CoR and TFIIIB″ DNA-binding domains) domain was identified (FIGS. 4 and 5). The myb-DNA binding domain is a SANT domain as well. This indicates that plant E(z)-like genes have a domain that may facilitate DNA binding. The SMART program also predicts the presence of a SANT domain in the animal E(z)-like proteins.

[0146] An acidic region is present in E(z)-like proteins near the N-terminal region (FIGS. 4 and 5). The function of this domain is not known. This acidic region is conserved in all E(z)-like proteins. A small region near amino acid 250 of the plant E(z)-like proteins is highly conserved. This region, named the CRRC region, is not recognized by the SMART program. The CRRC region is composed primarily of polar or charged residues.

[0147] Evolution of E(z) sequences:

[0148] Arabidopsis contains at least three E(z)-like genes that perform distinct functions. The low degree of nucleotide similarity between Mez1 and Mez2 indicates that these genes may have distinct evolutionary origins. The SET domain sequences of all E(z)-like proteins were aligned using ClustalW. This alignment was then processed using PHYLIP and a parsimonious tree was constructed (FIG. 5). The tree shows grouping of the Arabidopsis clf and the maize Mez1. When the full-length protein sequences were used for the alignments, the same tree was produced. The results indicate that Mez1 is a clf-like gene in maize while Mez2 is likely an EZA1 homolog.

[0149] Alternative splicing of Mez2:

[0150] In an attempt to generate a full-length Mez2 clone, PCR primers in the 5′ and 3′ UTR regions were used to amplify B73 ear cDNA. In addition to a major product of the expected size, two smaller products were observed (FIG. 6a). These two products were excised and used for PCR reactions with primers from various regions of the gene to detect where the difference in size was arising. A region near the middle of Mez2 was identified and the PCR products from the two isoforms, Mez2 alternative splice 1 (Mez2^(as1)) and Mez2 alternative splice 2 (Mez2^(as2)), were sequenced. Sequencing revealed that the smaller products were identical to Mez2 except for the missing 659 base pairs in Mez2^(as1) and 810 base pairs in Mez2^(as2). The deleted fragment in Mez2^(as1) corresponds to base pairs 1016 to 1676 of Mez2. The Mez2^(as1) deletion will cause a frameshift and a truncated protein of 341 amino acids (FIG. 6). The deletion in Mez2^(as2) corresponds to base pairs 1016 to 1827 of Mez2 and does not result in a frameshift. The deletion in Mez2^(as2) results in a 624 amino acid protein that is missing the SANT domain. It is possible that the presence of multiple products in these PCR reactions is due to secondary structure of the RNA or aberrant PCR products. However, the presence of products displaying identical size shifts in PCR reactions using multiple primer sets makes it unlikely that these are the result of mispriming events. No significant secondary structure was identified in these regions using secondary structure prediction programs. Together, these findings indicate that the presence of multiple products is most likely due to alternative splicing of Mez2 mRNA.
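
The frameshift reasoning above follows from simple arithmetic on the deletion sizes, which can be checked with a short Python sketch. The dictionary and function names are hypothetical; the deletion sizes are those reported in this paragraph.

```python
# Illustrative sketch only: whether a deletion of a given length preserves
# the reading frame. Deletion sizes are those reported above; names are hypothetical.

DELETIONS_BP = {"Mez2as1": 659, "Mez2as2": 810}

def preserves_frame(deleted_bp: int) -> bool:
    """A deletion within coding sequence keeps the reading frame only if
    its length is a multiple of three."""
    return deleted_bp % 3 == 0

for isoform, size in DELETIONS_BP.items():
    if preserves_frame(size):
        print(f"{isoform}: in frame, {size // 3} codons removed")
    else:
        print(f"{isoform}: frameshift ({size} mod 3 = {size % 3})")
```

A 659 base pair deletion leaves a remainder of two and therefore shifts the reading frame, whereas an 810 base pair deletion removes exactly 270 codons and leaves the downstream frame intact.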

[0151] Expression of Mez1 and Mez2:

[0152] cDNA from various maize tissues was tested for the presence of Mez1 and Mez2 transcripts. Abundant Mez1 transcripts were detected in embryo, ear and root tissues (FIG. 7a). Transcripts were also present in leaf, BMS cell culture, and pollen tissues. There were no tissues tested that did not contain Mez1 transcripts.

[0153] The same tissues were tested for the presence of Mez2 transcripts (FIG. 7b). The primers used to test for Mez2 expression flank the site of alternative splicing documented in ear tissue cDNA. Amplification from ear cDNA revealed the presence of the three transcripts observed previously. In the lane amplified from embryo cDNA, a doublet of Mez2^(as2) and a smaller fragment is observed. The sequence of this smaller fragment has not been analyzed. No Mez2 or Mez2^(as1) transcripts are observed in embryo tissue. Mez2 transcripts are the predominant form in leaf tissue, with very faint Mez2^(as1) and Mez2^(as2) products. An intense Mez2 product is amplified from immature tassel cDNA. In addition, a Mez2^(as2) product and two uncharacterized products are present. Only Mez2^(as1) transcripts are detected in 3-day root cDNA. Faint Mez2 and Mez2^(as2) products are observed from the BMS cell culture cDNA.

[0154] Mutator insertions into Mez2:

[0155] The Mez1 and Mez2 sequences were submitted to the Pioneer Hi-Bred Int'l TUSC system. The TUSC system is designed to find Mutator (Mu) insertions in a sequence of interest. Difficulties were encountered in designing primers to amplify the Mez1 sequence; Mez2 primers were designed and used to screen the DNA pools. Four independent insertions were found. The location of the four Mu insertions and five of the Mez2 introns are shown in FIG. 2a. Mez2-Mu1 is an intron insertion while Mez2-Mu2, Mez2-Mu3 and Mez2-Mu4 are all exon insertions.

[0156] All references, patents and patent applications referred to herein are hereby incorporated by reference.

[0157] The present invention is illustrated by way of the foregoing description and examples. The foregoing description is intended as a non-limiting illustration, since many variations will become apparent to those skilled in the art in view thereof. It is intended that all such variations within the scope and spirit of the appended claims be embraced thereby.

[0158] Changes can be made to the composition, operation and arrangement of the method of the present invention described herein without departing from the concept and scope of the invention as defined in the following claims.

1 20 1 3180 DNA Zea mays 1 cgcgtgtgag ggcgggagag cgcgcggggc tagggtttccgcgggtgatg gaagcagagg 60 ctgccgcggc ggtagtggcg tcgtccgcat ctgcctcggcttccgcgggc cggtctcgcc 120 catctagcag cgccgcctcg gtcaccagta attcggctgtgcgagctgga gaagaaaatg 180 ctgcctccct ctatgtttta tctgttattg actcgttaaaaaagaggatt accgcagatc 240 gtttgactta cattaagaat aggatagggg agaacaagactaatatcagc agctatacac 300 agaggactta caatttatca aaaaataggc aaattagtacatcaaagggt actgattcag 360 catcaaattt gctcacaaaa aggcaagatg atgcgctatgcaccctgcat agtcttgata 420 ttattccggt tgacaaagat ggtggcactt ttcaagacgaaagtcctttc tcttcatcta 480 atgttatgtt tggtggaaat cttggtccca agaatgctattattagacca attaaactac 540 cagaagtgcc aaagcttcca ccttatacaa catggatatttttggacagg aaccaaagga 600 tgacagaaga ccaatctgta cttggtcgac ggaggatttactatgatacc agttgtggtg 660 aagctctaat ttgcagtgat agtgaagatg aagccattgaagatgaggag gaaaaaaagg 720 aatttaaaca ttctgaagat cacattattc ggatgacagttcaagaatgt ggcatgtctg 780 atgctgtact gcaaacgcta gctcgacaca tggagcgggctgctgatgac ataaaggcca 840 ggtatgaaat tctgcatggt gagaaaacta aggattcttgcaagaaaggg actgagcata 900 atgtcaaagt ggaagatttg tactgtgaca aagatttggatgcagcattg gattcttttg 960 acaatctctt ctgtcgacga tgtctagtgt ttgattgcaagctacatggg tgttctcaag 1020 atttagtatt tcctccagaa aaacaaccag cttggaggggcgttgatgac agtgtaccct 1080 gtggtattca ttcccataaa ctggcatctg aaccagattctgctgctggt gctgatccca 1140 tgctttttga tgttgaggag ccaactcact catcagacaatgtgatgaac cagccaggtt 1200 caaataggaa aaagaacggc tccagtggaa ggaagactaaatctcaacaa agtgaaagct 1260 cttcaactgc aagagttatc tcagaaagca gtgcttcggaagtacatcca ataagcaata 1320 aatctccaca acactcccct agtccctcaa aagttaaaattgggccaaaa ggtggaatca 1380 gaaagattac caatagacga atcgctgaga gaattcttatgagtgtgaag aaaggacaaa 1440 gggaaatggc atcatctgat tctaattttg ttagtggatatcttttggca agggacatga 1500 agcttaggtc tgatacacga aatggaaata aggaattaattgtatcctca caacagagtt 1560 ctccaagcac aagaagttcc aaaaagaaga gtacacctcaaattgggaac agctcagctt 1620 ttgctgaggc tcataatgat tcaacagagg aagcaaataaccgtcattca gcaacagatg 1680 gttacgatag ttcaaggaaa 
gaagaattcg tcaatgagaatttatgcaag caggaggtgt 1740 acttgagatc atggaaggca attgagcagg gacttcttgtgaaaggatta gagatttttg 1800 gaaggaacag ttgtttaatt gctcggaacc ttcttggtggaatgaagacg tgcaaagatg 1860 tttttcaata tatgaattat attgaaaaca acagtgcctctggagctctt agtggtgttg 1920 attctcttgt caaaggatat attaagggta ctgagttgcgcacaagatca agatatttta 1980 gaaggcgagg taaagtccgt cgtttgaagt acacctggaaatctgcaggt tacaatttca 2040 aaaggattac cgaaaggaag gatcagcctt gtcgacaatataatccttgt ggttgtcaat 2100 ctacatgcgg aaagcagtgt ccatgtcttt caaatgggacatgttgtgag aaatactgtg 2160 ggtgtccaaa aatttgcaag aatcgttttc gaggatgtcacttgtgcaag agccagtgtc 2220 gcagccgcca atgtccatgt tttgcagctg acagggaatgcgatccggat gtttgcagaa 2280 actgttgggt tgggtgtggt gatggtacat tgggagttccaaaccagaga ggagataatt 2340 atgaatgccg gaacatgaaa ctgcttctta aacaacaacaaagggtctta cttggaagat 2400 cagatgtctc tggctgggga gcattcctca agaatagtgttagcaaacat gaataccttg 2460 gtgagtacac tggggaacta atctcacaca aagaagcagataagcgtgga aagatatatg 2520 atcgtgagaa ctcatcgttc cttttcaacc tgaacaacgagtatgttctt gacgcataca 2580 gaatgggtga caagctgaaa tttgccaacc atgcccctgacccgaattgc tatgccaagg 2640 ttatcatggt aactggtgat catagagtgg gcatattcgccaaagaaaga atcctcgctg 2700 gtgaagagtt attctacgat taccgctatg agcctgacagagcccctgct tgggcccgta 2760 agcctgaggc gtcgggagca aaggatgatg ggcaaccgttcaatgggcgt gcaaagaagc 2820 tcgcccaaaa caacagaggc tgaatctgat ttgattctttcattgttagg acaaatttgg 2880 cagccattca actaatataa ggaacctgtc attcataggccccaatttat ttgaactcgt 2940 cattgtaact cgtatgtgct tgaattctcc atggcagctggtcctgccat ccgtagagtt 3000 aggtcccgtt tgttttgagg aactaaaaat taatccctctattttagtca cattgagtct 3060 tagattgtta aacggcggga ctaaaacaaa agactaaactatttgtctct agtacctcaa 3120 gccatgacta aaagggaata aatcatataa attttatttttatccttcct ttaaaaaaaa 3180 2 931 PRT Zea mays 2 Met Glu Ala Glu Ala AlaAla Ala Val Val Ala Ser Ser Ala Ser Ala 1 5 10 15 Ser Ala Ser Ala GlyArg Ser Arg Pro Ser Ser Ser Ala Ala Ser Val 20 25 30 Thr Ser Asn Ser AlaVal Arg Ala Gly Glu Glu Asn Ala Ala Ser Leu 35 40 45 Tyr Val Leu Ser ValIle Asp Ser 
Leu Lys Lys Arg Ile Thr Ala Asp 50 55 60 Arg Leu Thr Tyr IleLys Asn Arg Ile Gly Glu Asn Lys Thr Asn Ile 65 70 75 80 Ser Ser Tyr ThrGln Arg Thr Tyr Asn Leu Ser Lys Asn Arg Gln Ile 85 90 95 Ser Thr Ser LysGly Thr Asp Ser Ala Ser Asn Leu Leu Thr Lys Arg 100 105 110 Gln Asp AspAla Leu Cys Thr Leu His Ser Leu Asp Ile Ile Pro Val 115 120 125 Asp LysAsp Gly Gly Thr Phe Gln Asp Glu Ser Pro Phe Ser Ser Ser 130 135 140 AsnVal Met Phe Gly Gly Asn Leu Gly Pro Lys Asn Ala Ile Ile Arg 145 150 155160 Pro Ile Lys Leu Pro Glu Val Pro Lys Leu Pro Pro Tyr Thr Thr Trp 165170 175 Ile Phe Leu Asp Arg Asn Gln Arg Met Thr Glu Asp Gln Ser Val Leu180 185 190 Gly Arg Arg Arg Ile Tyr Tyr Asp Thr Ser Cys Gly Glu Ala LeuIle 195 200 205 Cys Ser Asp Ser Glu Asp Glu Ala Ile Glu Asp Glu Glu GluLys Lys 210 215 220 Glu Phe Lys His Ser Glu Asp His Ile Ile Arg Met ThrVal Gln Glu 225 230 235 240 Cys Gly Met Ser Asp Ala Val Leu Gln Thr LeuAla Arg His Met Glu 245 250 255 Arg Ala Ala Asp Asp Ile Lys Ala Arg TyrGlu Ile Leu His Gly Glu 260 265 270 Lys Thr Lys Asp Ser Cys Lys Lys GlyThr Glu His Asn Val Lys Val 275 280 285 Glu Asp Leu Tyr Cys Asp Lys AspLeu Asp Ala Ala Leu Asp Ser Phe 290 295 300 Asp Asn Leu Phe Cys Arg ArgCys Leu Val Phe Asp Cys Lys Leu His 305 310 315 320 Gly Cys Ser Gln AspLeu Val Phe Pro Pro Glu Lys Gln Pro Ala Trp 325 330 335 Arg Gly Val AspAsp Ser Val Pro Cys Gly Ile His Ser His Lys Leu 340 345 350 Ala Ser GluPro Asp Ser Ala Ala Gly Ala Asp Pro Met Leu Phe Asp 355 360 365 Val GluGlu Pro Thr His Ser Ser Asp Asn Val Met Asn Gln Pro Gly 370 375 380 SerAsn Arg Lys Lys Asn Gly Ser Ser Gly Arg Lys Thr Lys Ser Gln 385 390 395400 Gln Ser Glu Ser Ser Ser Thr Ala Arg Val Ile Ser Glu Ser Ser Ala 405410 415 Ser Glu Val His Pro Ile Ser Asn Lys Ser Pro Gln His Ser Pro Ser420 425 430 Pro Ser Lys Val Lys Ile Gly Pro Lys Gly Gly Ile Arg Lys IleThr 435 440 445 Asn Arg Arg Ile Ala Glu Arg Ile Leu Met Ser Val Lys LysGly Gln 450 455 460 Arg Glu Met Ala Ser Ser Asp Ser Asn Phe Val Ser GlyTyr Leu Leu 465 470 
475 480 Ala Arg Asp Met Lys Leu Arg Ser Asp Thr ArgAsn Gly Asn Lys Glu 485 490 495 Leu Ile Val Ser Ser Gln Gln Ser Ser ProSer Thr Arg Ser Ser Lys 500 505 510 Lys Lys Ser Thr Pro Gln Ile Gly AsnSer Ser Ala Phe Ala Glu Ala 515 520 525 His Asn Asp Ser Thr Glu Glu AlaAsn Asn Arg His Ser Ala Thr Asp 530 535 540 Gly Tyr Asp Ser Ser Arg LysGlu Glu Phe Val Asn Glu Asn Leu Cys 545 550 555 560 Lys Gln Glu Val TyrLeu Arg Ser Trp Lys Ala Ile Glu Gln Gly Leu 565 570 575 Leu Val Lys GlyLeu Glu Ile Phe Gly Arg Asn Ser Cys Leu Ile Ala 580 585 590 Arg Asn LeuLeu Gly Gly Met Lys Thr Cys Lys Asp Val Phe Gln Tyr 595 600 605 Met AsnTyr Ile Glu Asn Asn Ser Ala Ser Gly Ala Leu Ser Gly Val 610 615 620 AspSer Leu Val Lys Gly Tyr Ile Lys Gly Thr Glu Leu Arg Thr Arg 625 630 635640 Ser Arg Tyr Phe Arg Arg Arg Gly Lys Val Arg Arg Leu Lys Tyr Thr 645650 655 Trp Lys Ser Ala Gly Tyr Asn Phe Lys Arg Ile Thr Glu Arg Lys Asp660 665 670 Gln Pro Cys Arg Gln Tyr Asn Pro Cys Gly Cys Gln Ser Thr CysGly 675 680 685 Lys Gln Cys Pro Cys Leu Ser Asn Gly Thr Cys Cys Glu LysTyr Cys 690 695 700 Gly Cys Pro Lys Ile Cys Lys Asn Arg Phe Arg Gly CysHis Leu Cys 705 710 715 720 Lys Ser Gln Cys Arg Ser Arg Gln Cys Pro CysPhe Ala Ala Asp Arg 725 730 735 Glu Cys Asp Pro Asp Val Cys Arg Asn CysTrp Val Gly Cys Gly Asp 740 745 750 Gly Thr Leu Gly Val Pro Asn Gln ArgGly Asp Asn Tyr Glu Cys Arg 755 760 765 Asn Met Lys Leu Leu Leu Lys GlnGln Gln Arg Val Leu Leu Gly Arg 770 775 780 Ser Asp Val Ser Gly Trp GlyAla Phe Leu Lys Asn Ser Val Ser Lys 785 790 795 800 His Glu Tyr Leu GlyGlu Tyr Thr Gly Glu Leu Ile Ser His Lys Glu 805 810 815 Ala Asp Lys ArgGly Lys Ile Tyr Asp Arg Glu Asn Ser Ser Phe Leu 820 825 830 Phe Asn LeuAsn Asn Glu Tyr Val Leu Asp Ala Tyr Arg Met Gly Asp 835 840 845 Lys LeuLys Phe Ala Asn His Ala Pro Asp Pro Asn Cys Tyr Ala Lys 850 855 860 ValIle Met Val Thr Gly Asp His Arg Val Gly Ile Phe Ala Lys Glu 865 870 875880 Arg Ile Leu Ala Gly Glu Glu Leu Phe Tyr Asp Tyr Arg Tyr Glu Pro 885890 895 Asp Arg Ala Pro Ala Trp 
Ala Arg Lys Pro Glu Ala Ser Gly Ala Lys900 905 910 Asp Asp Gly Gln Pro Phe Asn Gly Arg Ala Lys Lys Leu Ala GlnAsn 915 920 925 Asn Arg Gly 930 3 3030 DNA Zea mays 3 ccgtcgcagaattcgcgcca ccgcccgcga tggcttcgtc ctcgaaggcc tccgattcct 60 cccaacgatccaagcggtcg gatcagggga tgggcaagga cgccgctgcc gcctctgttg 120 tcccgatccacgcgaacctg acgcagctga tacggcaagt ccaatcgggg cgcctcgcgt 180 acatcaaggagaaattggag gtgaacagga aaacgctgca gaggcactcc tgctcgctgt 240 tcgacgtggcagcggcggcg gaggtggcgt cgaggggcac cgatggcggc aacgcgctgt 300 cacagcgcgcggcggagaga cagtgtgggt cagacctggc aaacgggata ggggagaggg 360 atgtggtttccgttcacgag gagaacctgg ctaccggtac gctcgcgctc tccagctcgg 420 gcgctaccgcgcagcggaca attgtgcggt tcgtgaagct gccgctggtt gagaagatcc 480 ctccgtacaccacttggatc ttcctggaca aaaaccaaag aatggctgac gatcagtcag 540 ttgttggtaggagaaggata tactatgata cagttggaaa cgaggctctg atctgcagtg 600 acagtgatgaagaaattcca gaaccagagg aagagaaaca ctttttcaca aagggagaag 660 atcatttgatatggagagct actcaagacc atgggttaaa ccaagaggtt gttaatgtcc 720 tttgccagtttattggtgca accccatcag aaattgagga aagatctgaa gtcctatttg 780 agaaaaatgagaagcactca ggatcttcag ataagataga gagccgactt tctcttgaca 840 aaactatggatgccgttctg gattcttttg ataatctctt ctgccgcaga tgcttggttt 900 ttgattgccgccttcatggt tgttcacaga atttggtatt tccatgtgag aagcaaccct 960 acagctttgaccctgatgaa aacaagaagc catgtggtca tttgtgctac cttcgatttc 1020 cccagtggagagaaggattt aaagagatgc atgatgatgg tcttgctggt ggtgcaacat 1080 atactatggaatcgggaact gcctcacaga gagttgatgt taatgttatg tatgaatcag 1140 aagattcaaaccgacagaaa ggcaacatta ggtccatgac actagttgga accagtggac 1200 caaaaataatttcttctgtc agtgcggaag aaagcactac tactccagca gatatctctg 1260 aaacagagaatgtatcctct gatttgcctc ccagtagttt aaggaaacac aagatttcaa 1320 aacatggacctaggtacagg gagcattctc ctggcaaaag gcagaaggtt ttcacttctg 1380 acatttcttttgaaggcagt ataatgaata aactttccat tccggagatt cgtgacacaa 1440 gactagagtccagagaatct ggtggtgata aactacgaat tcttgacgag tccactaaga 1500 agacttcaaggaaagatatg tgtggggaaa gcccagctac taccatggaa aatgtgggaa 1560 gacagagtaataaagtgtat tcaacaaaga 
atttcttgga gtccactctt tcttgttgga 1620 gtgccttagagcgagatcta tacttgaagg gcatagagat atttggaaag aacagctgtc 1680 tcatcgccagaaacttacta tctggtctta agacctgcat agaagtggca aactacatgt 1740 ataacaatggtgcagcgatg gcgaagagac ctctcttgaa taaatccatc tcaggcgact 1800 ttgcagaaaatgaacaagac tacatggagc aagacatggc tgccagaaca agaatctatc 1860 gtcggaggggccgcaatcga aagctgaaat atacttggaa atctgcaggg catccaactg 1920 ttagaaaaagaactgatgac gggaagcaat gttacacaca atatagccca tgtgcgtgcc 1980 agcaaatgtgtggtaaagat tgcccctgtg cggacaaggg aacttgctgt gagaagtact 2040 gtgggtgttcgaagagctgc aaaaacaagt ttagaggctg tcattgtgca aaaagccaat 2100 gcagaagcagacagtgcccc tgttttgcag ccagtcgtga atgtgatcca gatgtttgta 2160 ggaattgctgggtgagctgt ggagatggtt cactaggtga gccacttgca agaggtgatg 2220 gttatcagtgtggaaatatg aagctccttt tgaaacaaca gcaaaggatt ttgttaggaa 2280 gatctgatgttgcaggttgg ggtgcattca ttaagaatcc tgtaaataaa aatgattatc 2340 ttggagaatatactggtgaa ttgatctctc acaaggaagc agacaaacgc ggcaaaattt 2400 atgaccgggcaaactcatct tttctgttcg atttaaatga ccagtatgtg ttggatgctt 2460 atcgcaagggggacaaattg aagttcgcaa atcactcatc taaccccaac tgctatgcaa 2520 aggtgatgctggtggctggc gaccatcggg ttggtatata tgcgaaggag catattgagg 2580 ctagcgaggaactcttttat gattatcgtt atggacctga ccaggctccg gcttgggcta 2640 ggagacccgaaggatcaaag aaggacgagg catccttctc tcaccgtcga gcacacaaag 2700 tggctcgatagctgaagagt cgctccggat gatacaatat gcagtaaact taatacttaa 2760 tacatgattcagtcctagtt cattggtaga taaacatgct atatactatc cattagtaaa 2820 taaactctcattcatcgagt tggagaataa atgcgtataa acatatgtgg acctcaggtc 2880 gggaaggtggcaaccttgtt agtttgagca ccaacaggtt ctcaaacttg agtggctatt 2940 gctagagtatcaaataatgg ctgcgactat agccttgttt gtatattttc ttggtgagat 3000 gaaataatttgtcaaatgta cacttaaaaa 3030 4 893 PRT Zea mays 4 Met Ala Ser Ser Ser LysAla Ser Asp Ser Ser Gln Arg Ser Lys Arg 1 5 10 15 Ser Asp Gln Gly MetGly Lys Asp Ala Ala Ala Ala Ser Val Val Pro 20 25 30 Ile His Ala Asn LeuThr Gln Leu Ile Arg Gln Val Gln Ser Gly Arg 35 40 45 Leu Ala Tyr Ile LysGlu Lys Leu Glu Val Asn Arg Lys Thr Leu Gln 50 55 60 Arg 
His Ser Cys SerLeu Phe Asp Val Ala Ala Ala Ala Glu Val Ala 65 70 75 80 Ser Arg Gly ThrAsp Gly Gly Asn Ala Leu Ser Gln Arg Ala Ala Glu 85 90 95 Arg Gln Cys GlySer Asp Leu Ala Asn Gly Ile Gly Glu Arg Asp Val 100 105 110 Val Ser ValHis Glu Glu Asn Leu Ala Thr Gly Thr Leu Ala Leu Ser 115 120 125 Ser SerGly Ala Thr Ala Gln Arg Thr Ile Val Arg Phe Val Lys Leu 130 135 140 ProLeu Val Glu Lys Ile Pro Pro Tyr Thr Thr Trp Ile Phe Leu Asp 145 150 155160 Lys Asn Gln Arg Met Ala Asp Asp Gln Ser Val Val Gly Arg Arg Arg 165170 175 Ile Tyr Tyr Asp Thr Val Gly Asn Glu Ala Leu Ile Cys Ser Asp Ser180 185 190 Asp Glu Glu Ile Pro Glu Pro Glu Glu Glu Lys His Phe Phe ThrLys 195 200 205 Gly Glu Asp His Leu Ile Trp Arg Ala Thr Gln Asp His GlyLeu Asn 210 215 220 Gln Glu Val Val Asn Val Leu Cys Gln Phe Ile Gly AlaThr Pro Ser 225 230 235 240 Glu Ile Glu Glu Arg Ser Glu Val Leu Phe GluLys Asn Glu Lys His 245 250 255 Ser Gly Ser Ser Asp Lys Ile Glu Ser ArgLeu Ser Leu Asp Lys Thr 260 265 270 Met Asp Ala Val Leu Asp Ser Phe AspAsn Leu Phe Cys Arg Arg Cys 275 280 285 Leu Val Phe Asp Cys Arg Leu HisGly Cys Ser Gln Asn Leu Val Phe 290 295 300 Pro Cys Glu Lys Gln Pro TyrSer Phe Asp Pro Asp Glu Asn Lys Lys 305 310 315 320 Pro Cys Gly His LeuCys Tyr Leu Arg Phe Pro Gln Trp Arg Glu Gly 325 330 335 Phe Lys Glu MetHis Asp Asp Gly Leu Ala Gly Gly Ala Thr Tyr Thr 340 345 350 Met Glu SerGly Thr Ala Ser Gln Arg Val Asp Val Asn Val Met Tyr 355 360 365 Glu SerGlu Asp Ser Asn Arg Gln Lys Gly Asn Ile Arg Ser Met Thr 370 375 380 LeuVal Gly Thr Ser Gly Pro Lys Ile Ile Ser Ser Val Ser Ala Glu 385 390 395400 Glu Ser Thr Thr Thr Pro Ala Asp Ile Ser Glu Thr Glu Asn Val Ser 405410 415 Ser Asp Leu Pro Pro Ser Ser Leu Arg Lys His Lys Ile Ser Lys His420 425 430 Gly Pro Arg Tyr Arg Glu His Ser Pro Gly Lys Arg Gln Lys ValPhe 435 440 445 Thr Ser Asp Ile Ser Phe Glu Gly Ser Ile Met Asn Lys LeuSer Ile 450 455 460 Pro Glu Ile Arg Asp Thr Arg Leu Glu Ser Arg Glu SerGly Gly Asp 465 470 475 480 Lys Leu Arg Ile Leu Asp Glu Ser Thr 
Lys LysThr Ser Arg Lys Asp 485 490 495 Met Cys Gly Glu Ser Pro Ala Thr Thr MetGlu Asn Val Gly Arg Gln 500 505 510 Ser Asn Lys Val Tyr Ser Thr Lys AsnPhe Leu Glu Ser Thr Leu Ser 515 520 525 Cys Trp Ser Ala Leu Glu Arg AspLeu Tyr Leu Lys Gly Ile Glu Ile 530 535 540 Phe Gly Lys Asn Ser Cys LeuIle Ala Arg Asn Leu Leu Ser Gly Leu 545 550 555 560 Lys Thr Cys Ile GluVal Ala Asn Tyr Met Tyr Asn Asn Gly Ala Ala 565 570 575 Met Ala Lys ArgPro Leu Leu Asn Lys Ser Ile Ser Gly Asp Phe Ala 580 585 590 Glu Asn GluGln Asp Tyr Met Glu Gln Asp Met Ala Ala Arg Thr Arg 595 600 605 Ile TyrArg Arg Arg Gly Arg Asn Arg Lys Leu Lys Tyr Thr Trp Lys 610 615 620 SerAla Gly His Pro Thr Val Arg Lys Arg Thr Asp Asp Gly Lys Gln 625 630 635640 Cys Tyr Thr Gln Tyr Ser Pro Cys Ala Cys Gln Gln Met Cys Gly Lys 645650 655 Asp Cys Pro Cys Ala Asp Lys Gly Thr Cys Cys Glu Lys Tyr Cys Gly660 665 670 Cys Ser Lys Ser Cys Lys Asn Lys Phe Arg Gly Cys His Cys AlaLys 675 680 685 Ser Gln Cys Arg Ser Arg Gln Cys Pro Cys Phe Ala Ala SerArg Glu 690 695 700 Cys Asp Pro Asp Val Cys Arg Asn Cys Trp Val Ser CysGly Asp Gly 705 710 715 720 Ser Leu Gly Glu Pro Leu Ala Arg Gly Asp GlyTyr Gln Cys Gly Asn 725 730 735 Met Lys Leu Leu Leu Lys Gln Gln Gln ArgIle Leu Leu Gly Arg Ser 740 745 750 Asp Val Ala Gly Trp Gly Ala Phe IleLys Asn Pro Val Asn Lys Asn 755 760 765 Asp Tyr Leu Gly Glu Tyr Thr GlyGlu Leu Ile Ser His Lys Glu Ala 770 775 780 Asp Lys Arg Gly Lys Ile TyrAsp Arg Ala Asn Ser Ser Phe Leu Phe 785 790 795 800 Asp Leu Asn Asp GlnTyr Val Leu Asp Ala Tyr Arg Lys Gly Asp Lys 805 810 815 Leu Lys Phe AlaAsn His Ser Ser Asn Pro Asn Cys Tyr Ala Lys Val 820 825 830 Met Leu ValAla Gly Asp His Arg Val Gly Ile Tyr Ala Lys Glu His 835 840 845 Ile GluAla Ser Glu Glu Leu Phe Tyr Asp Tyr Arg Tyr Gly Pro Asp 850 855 860 GlnAla Pro Ala Trp Ala Arg Arg Pro Glu Gly Ser Lys Lys Asp Glu 865 870 875880 Ala Ser Phe Ser His Arg Arg Ala His Lys Val Ala Arg 885 890 5 340PRT Zea mays 5 Met Ala Ser Ser Ser Lys Ala Ser Asp Ser Ser Gln Arg SerLys Arg 
1 5 10 15 Ser Asp Gln Gly Met Gly Lys Asp Ala Ala Ala Ala SerVal Val Pro 20 25 30 Ile His Ala Asn Leu Thr Gln Leu Ile Arg Gln Val GlnSer Gly Arg 35 40 45 Leu Ala Tyr Ile Lys Glu Lys Leu Glu Val Asn Arg LysThr Leu Gln 50 55 60 Arg His Ser Cys Ser Leu Phe Asp Val Ala Ala Ala AlaGlu Val Ala 65 70 75 80 Ser Arg Gly Thr Asp Gly Gly Asn Ala Leu Ser GlnArg Ala Ala Glu 85 90 95 Arg Gln Cys Gly Ser Asp Leu Ala Asn Gly Ile GlyGlu Arg Asp Val 100 105 110 Val Ser Val His Glu Glu Asn Leu Ala Thr GlyThr Leu Ala Leu Ser 115 120 125 Ser Ser Gly Ala Thr Ala Gln Arg Thr IleVal Pro Val Arg Glu Ala 130 135 140 Ala Leu Val Glu Lys Ile Pro Pro TyrThr Thr Trp Ile Phe Leu Asp 145 150 155 160 Lys Asn Gln Arg Met Ala AspAsp Gln Ser Val Val Gly Arg Arg Arg 165 170 175 Ile Tyr Tyr Asp Thr ValGly Asn Glu Ala Leu Ile Cys Ser Asp Ser 180 185 190 Asp Glu Glu Ile ProGlu Pro Glu Glu Glu Lys His Phe Phe Thr Lys 195 200 205 Gly Glu Asp HisLeu Ile Trp Arg Ala Thr Gln Asp His Gly Leu Asn 210 215 220 Gln Glu ValVal Asn Val Leu Cys Gln Phe Ile Gly Ala Thr Pro Ser 225 230 235 240 GluIle Glu Glu Arg Ser Glu Val Leu Phe Glu Lys Asn Glu Lys His 245 250 255Ser Gly Ser Ser Asp Lys Ile Glu Ser Arg Leu Ser Leu Asp Lys Thr 260 265270 Met Asp Ala Val Leu Asp Ser Phe Asp Asn Leu Phe Cys Arg Arg Cys 275280 285 Leu Val Phe Asp Cys Arg Leu His Gly Cys Ser Gln Asn Leu Val Phe290 295 300 Pro Cys Glu Lys Gln Pro Tyr Ser Phe Asp Pro Asp Glu Asn LysLys 305 310 315 320 Pro Cys Gly His Leu Cys Tyr Leu Arg Leu Ser His ArgGln Lys Leu 325 330 335 Thr Ile Trp Ser 340 6 340 PRT Zea mays 6 Met AlaSer Ser Ser Lys Ala Ser Asp Ser Ser Gln Arg Ser Lys Arg 1 5 10 15 SerAsp Gln Gly Met Gly Lys Asp Ala Ala Ala Ala Ser Val Val Pro 20 25 30 IleHis Ala Asn Leu Thr Gln Leu Ile Arg Gln Val Gln Ser Gly Arg 35 40 45 LeuAla Tyr Ile Lys Glu Lys Leu Glu Val Asn Arg Lys Thr Leu Gln 50 55 60 ArgHis Ser Cys Ser Leu Phe Asp Val Ala Ala Ala Ala Glu Val Ala 65 70 75 80Ser Arg Gly Thr Asp Gly Gly Asn Ala Leu Ser Gln Arg Ala Ala Glu 85 90 95Arg Gln Cys 
Gly Ser Asp Leu Ala Asn Gly Ile Gly Glu Arg Asp Val 100 105 110 Val Ser Val His Glu Glu Asn Leu Ala Thr Gly Thr Leu Ala Leu Ser 115 120 125 Ser Ser Gly Ala Thr Ala Gln Arg Thr Ile Val Pro Val Arg Glu Ala 130 135 140 Ala Leu Val Glu Lys Ile Pro Pro Tyr Thr Thr Trp Ile Phe Leu Asp 145 150 155 160 Lys Asn Gln Arg Met Ala Asp Asp Gln Ser Val Val Gly Arg Arg Arg 165 170 175 Ile Tyr Tyr Asp Thr Val Gly Asn Glu Ala Leu Ile Cys Ser Asp Ser 180 185 190 Asp Glu Glu Ile Pro Glu Pro Glu Glu Glu Lys His Phe Phe Thr Lys 195 200 205 Gly Glu Asp His Leu Ile Trp Arg Ala Thr Gln Asp His Gly Leu Asn 210 215 220 Gln Glu Val Val Asn Val Leu Cys Gln Phe Ile Gly Ala Thr Pro Ser 225 230 235 240 Glu Ile Glu Glu Arg Ser Glu Val Leu Phe Glu Lys Asn Glu Lys His 245 250 255 Ser Gly Ser Ser Asp Lys Ile Glu Ser Arg Leu Ser Leu Asp Lys Thr 260 265 270 Met Asp Ala Val Leu Asp Ser Phe Asp Asn Leu Phe Cys Arg Arg Cys 275 280 285 Leu Val Phe Asp Cys Arg Leu His Gly Cys Ser Gln Asn Leu Val Phe 290 295 300 Pro Cys Glu Lys Gln Pro Tyr Ser Phe Asp Pro Asp Glu Asn Lys Lys 305 310 315 320 Pro Cys Gly His Leu Cys Tyr Leu Arg Leu Ser His Arg Gln Lys Leu 325 330 335 Thr Ile Trp Ser 340
7 22 DNA Zea mays 7 gggtgtggtg atggtacatt gg 22
8 25 DNA Zea mays 8 cagcttgtca cccattctgt atgcg 25
9 25 DNA Zea mays 9 tgcctcgtcc ttctttgatc cttcg 25
10 25 DNA Zea mays 10 ctcacaagga agcagacaaa cgcgg 25
11 25 DNA Zea mays 11 taccttggtg agtacactgg ggaac 25
12 24 DNA Zea mays 12 ccatttcgtg tatcagacct aagc 24
13 18 DNA Zea mays 13 catcaacgcc ctccaagc 18
14 25 DNA Zea mays 14 tgccacattc ttgaactgtc atccg 25
15 23 DNA Zea mays 15 gcacagtgac atcctcgaaa acg 23
16 17 DNA Zea mays 16 gtccctgctc aattgcc 17
17 20 DNA Zea mays 17 gcggacaatt gtgcggttcg 20
18 19 DNA Zea mays 18 ggttgttcac agaatttgg 19
19 26 DNA Zea mays 19 cttcctaaca aaatcctttg ctgttg 26
20 18 DNA Zea mays 20 ttgctccatg tagtcttg 18
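Each short DNA entry at the end of the sequence listing (SEQ ID NOs: 7 through 20) appears to give a declared length before the sequence and a running residue count after it. The following is a minimal sketch, not part of the original listing, of how those declared lengths could be checked against the printed residues. The helper names (residue_count, check_lengths) are hypothetical, and only four entries are copied in for illustration.

```python
# Declared lengths and sequences copied from the listing entries for
# SEQ ID NOs: 7, 13, 16 and 20. The dictionary layout is an assumption
# made for this sketch, not a format defined by the listing.
PRIMERS = {
    7: (22, "gggtgtggtg atggtacatt gg"),
    13: (18, "catcaacgcc ctccaagc"),
    16: (17, "gtccctgctc aattgcc"),
    20: (18, "ttgctccatg tagtcttg"),
}


def residue_count(seq: str) -> int:
    """Count residues, ignoring the spaces that group bases in tens."""
    return sum(1 for ch in seq if ch.isalpha())


def check_lengths(entries: dict) -> dict:
    """Map each SEQ ID NO to True when the declared length matches."""
    return {
        seq_id: residue_count(seq) == declared
        for seq_id, (declared, seq) in entries.items()
    }


if __name__ == "__main__":
    for seq_id, ok in sorted(check_lengths(PRIMERS).items()):
        status = "matches" if ok else "MISMATCH"
        print(f"SEQ ID NO: {seq_id} declared length {status}")
```

A check of this kind is insensitive to where the grouping spaces fall, so it still works on entries in which line wrapping has removed a space between two ten-base groups.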

What is claimed is:
1. An isolated and purified nucleic acid comprising a polynucleotide selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3 and conservatively modified and polymorphic variants thereof.
2. The isolated and purified nucleic acid of claim 1, wherein the polynucleotide is at least 15 nucleotides in length.
3. An isolated and purified nucleic acid comprising a polynucleotide having at least 60% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
4. An isolated and purified nucleic acid comprising a polynucleotide having at least 70% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
5. An isolated and purified nucleic acid comprising a polynucleotide having at least 80% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
6. An isolated and purified nucleic acid comprising a polynucleotide having at least 90% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
7. An isolated and purified nucleic acid comprising a polynucleotide having at least 95% identity to a polynucleotide selected from the group consisting of SEQ ID NO: 1 and SEQ ID NO: 3.
8. An isolated and purified polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 4 and conservatively modified and polymorphic variants thereof.
9. An isolated and purified polypeptide comprising an amino acid sequence having at least 60% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
10. An isolated and purified polypeptide comprising an amino acid sequence having at least 70% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
11. An isolated and purified polypeptide comprising an amino acid sequence having at least 80% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
12. An isolated and purified polypeptide comprising an amino acid sequence having at least 90% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
13. An isolated and purified polypeptide comprising an amino acid sequence having at least 95% identity to an amino acid sequence selected from the group consisting of SEQ ID NO: 2 and SEQ ID NO: 4.
14. An expression cassette comprising a promoter sequence operably linked to a nucleic acid of claim 1.
15. The expression cassette of claim 14 further comprising a polyadenylation signal operably linked to the polynucleotide.
16. The expression cassette of claim 14 wherein the promoter is a constitutive or tissue specific promoter.
17. A bacterial cell comprising the expression cassette of claim 14.
18. The bacterial cell of claim 17 wherein the bacterial cell is an Agrobacterium tumefaciens cell or an Agrobacterium rhizogenes cell.
19. A plant cell transformed with the expression cassette of claim 14.
20. A transformed plant containing the plant cell of claim 19.
21. The transformed plant of claim 20 wherein the plant is Zea mays.
22. Seed from the transformed plant of claim 20.
23. Transformed plant seed containing the plant cell of claim 20.
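
The identity thresholds recited in the claims (60%, 70%, 80%, 90% and 95%) can be illustrated with a rough sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and that identity is counted only over positions where neither sequence has a gap. Both assumptions, the function name percent_identity, and the toy strings are hypothetical; this excerpt does not state the alignment program or parameters used to determine identity, and those would govern in practice.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of matching residues over gap-free aligned positions.

    Assumes both inputs are pre-aligned, equal-length strings in which
    '-' marks a gap. Other denominators (for example, full alignment
    length) are also in common use and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy, made-up alignment; not taken from any SEQ ID NO.
    value = percent_identity("ACGT-A", "ACCTTA")
    for threshold in (60, 70, 80, 90, 95):
        print(f"at least {threshold}% identity: {value >= threshold}")
```

Because the choice of alignment method and denominator changes the result, a value from this sketch should be read only as an illustration of the arithmetic.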